[flang-commits] [flang] [flang][debug] Avoid redundant debug data generation for derived types. (PR #124473)
Abid Qadeer via flang-commits
flang-commits at lists.llvm.org
Sun Jan 26 08:47:34 PST 2025
https://github.com/abidh created https://github.com/llvm/llvm-project/pull/124473
Since https://github.com/llvm/llvm-project/pull/122770, we have seen that compile time have become extremely slow for cyclic derived types. In #122770, we made the criteria to cache a derived type very strict. As a result, some types which are safe to cache were also being re-generated every type they were required. This increased the compile time and also the size of the debug info.
Please see the description of PR# 122770. We decided that when processing `t1`, the type generated for `t2` and `t3` were not safe to cached. But our algorithm also denied caching to `t1` which as top level type was safe.
```
type t1
type(t2), pointer :: p1
end type
type t2
type(t3), pointer :: p2
end type
type t3
type(t1), pointer :: p3
end type
```
I have tinkered the check a bit so that top level type is always cached. To detect a top level type, we use a depth counter that get incremented before call to `convertRecordType` and decremented after it returns.
After this change, the following [file](https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90) from Fujitsu get compiled around 40s which is same as it was before #122770.
The smaller testcase present in issue #124049 takes less than half a second. I also added check to make sure that duplicate entries of the `DICompositeType` are not present in the IR.
Fixes #124049 and #123960.
>From 468e84801138141ff4b4affab9bf687dad0c9f4a Mon Sep 17 00:00:00 2001
From: Abid Qadeer <haqadeer at amd.com>
Date: Sun, 26 Jan 2025 16:18:03 +0000
Subject: [PATCH] [flang][debug] Avoid redundant debug data generation for
derived types.
Since https://github.com/llvm/llvm-project/pull/122770, we have seen
that compile time have become extremely slow for cyclic derived types.
In 122770, we made the criteria to cache a type very strict. As a result,
some types which are safe to cache were also being re-generated every
type they were required. This increased the compile time and also the
size of the debug info.
Please see the description of PR# 122770. We decided that when
processing t1, the type generated for t2 and t3 were not safe to
cached. But our algorithm also denied caching to t1 which as top level
type was safe.
type t1
type(t2), pointer :: p1
end type
type t2
type(t3), pointer :: p2
end type
type t3
type(t1), pointer :: p3
end type
I have tinkered the check a bit so that top level type is always cached.
To detect a top level type, we use a depth counter that get incremented
before call to convertRecordType and decremented after it returns.
After this change, the following file from Fujitsu get compiled under
a minute.
https://github.com/fujitsu/compiler-test-suite/blob/main/Fortran/0394/0394_0031.f90
The smaller testcase present in issue 124049 takes around half second. I
also added check to make sure that duplicate entries of the
DICompositeType are not present in the IR.
Fixes #124049 and #123960.
---
.../Transforms/DebugTypeGenerator.cpp | 30 +++++++++++++++++--
.../Optimizer/Transforms/DebugTypeGenerator.h | 1 +
.../debug-cyclic-derived-type-3.f90 | 8 +++--
3 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
index 555f354521c9b7..cdd30dce183dd5 100644
--- a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
+++ b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
@@ -48,7 +48,8 @@ DebugTypeGenerator::DebugTypeGenerator(mlir::ModuleOp m,
mlir::SymbolTable *symbolTable_,
const mlir::DataLayout &dl)
: module(m), symbolTable(symbolTable_), dataLayout{&dl},
- kindMapping(getKindMapping(m)), llvmTypeConverter(m, false, false, dl) {
+ kindMapping(getKindMapping(m)), llvmTypeConverter(m, false, false, dl),
+ derivedTypeDepth(0) {
LLVM_DEBUG(llvm::dbgs() << "DITypeAttr generator\n");
mlir::MLIRContext *context = module.getContext();
@@ -407,7 +408,10 @@ mlir::LLVM::DITypeAttr DebugTypeGenerator::convertRecordType(
/*baseType=*/nullptr, mlir::LLVM::DIFlags::Zero, offset * 8,
/*alignInBits=*/0, elements, /*dataLocation=*/nullptr, /*rank=*/nullptr,
/*allocated=*/nullptr, /*associated=*/nullptr);
- if (canCacheThisType) {
+
+ // derivedTypeDepth == 1 means that it is a top level type which is safe to
+ // cache.
+ if (canCacheThisType || derivedTypeDepth == 1) {
typeCache[Ty] = finalAttr;
} else {
auto iter = typeCache.find(Ty);
@@ -663,7 +667,27 @@ DebugTypeGenerator::convertType(mlir::Type Ty, mlir::LLVM::DIFileAttr fileAttr,
return convertCharacterType(charTy, fileAttr, scope, declOp,
/*hasDescriptor=*/false);
} else if (auto recTy = mlir::dyn_cast_if_present<fir::RecordType>(Ty)) {
- return convertRecordType(recTy, fileAttr, scope, declOp);
+ // For nested derived types like shown below, the call sequence of the
+ // convertRecordType will look something like as follows:
+ // convertRecordType (t1)
+ // convertRecordType (t2)
+ // convertRecordType (t3)
+ // We need to recognize when we are processing the top level type like t1
+ // to make caching decision. The variable `derivedTypeDepth` is used for
+ // this purpose and maintains the current depth of derived type processing.
+ // type t1
+ // type(t2), pointer :: p1
+ // end type
+ // type t2
+ // type(t3), pointer :: p2
+ // end type
+ // type t2
+ // integer a
+ // end type
+ derivedTypeDepth++;
+ auto result = convertRecordType(recTy, fileAttr, scope, declOp);
+ derivedTypeDepth--;
+ return result;
} else if (auto tupleTy = mlir::dyn_cast_if_present<mlir::TupleType>(Ty)) {
return convertTupleType(tupleTy, fileAttr, scope, declOp);
} else if (auto refTy = mlir::dyn_cast_if_present<fir::ReferenceType>(Ty)) {
diff --git a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.h b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.h
index 7daa0af166e697..cc4b5428ee1a9e 100644
--- a/flang/lib/Optimizer/Transforms/DebugTypeGenerator.h
+++ b/flang/lib/Optimizer/Transforms/DebugTypeGenerator.h
@@ -91,6 +91,7 @@ class DebugTypeGenerator {
std::uint64_t lenOffset;
std::uint64_t rankOffset;
std::uint64_t rankSize;
+ int32_t derivedTypeDepth;
llvm::DenseMap<mlir::Type, mlir::LLVM::DITypeAttr> typeCache;
};
diff --git a/flang/test/Integration/debug-cyclic-derived-type-3.f90 b/flang/test/Integration/debug-cyclic-derived-type-3.f90
index ef9aed13cc514c..d91c635576e26b 100644
--- a/flang/test/Integration/debug-cyclic-derived-type-3.f90
+++ b/flang/test/Integration/debug-cyclic-derived-type-3.f90
@@ -1,4 +1,4 @@
-! RUN: %flang_fc1 -emit-llvm -debug-info-kind=standalone %s -o -
+! RUN: %flang_fc1 -emit-llvm -debug-info-kind=standalone %s -o - | FileCheck %s
! mainly test that this program does not cause an assertion failure
! testcase for issue 122024
@@ -17,7 +17,7 @@ module m1
program test
use m1
- type(t1),pointer :: foo
+ type(t1),pointer :: foo, foo2
allocate(foo)
allocate(foo%x1)
allocate(foo%x1%x2)
@@ -30,3 +30,7 @@ subroutine sub1(bar)
use m1
type(t2) :: bar
end subroutine
+
+! Test that file compiles ok and there is only one DICompositeType for "t1".
+!CHECK: !DICompositeType(tag: DW_TAG_structure_type, name: "t1"{{.*}})
+!CHECK-NOT: !DICompositeType(tag: DW_TAG_structure_type, name: "t1"{{.*}})
More information about the flang-commits
mailing list