r337456 - [CodeGen] Disable aggressive structor optimizations at -O0, take 3

Pavel Labath via cfe-commits cfe-commits at lists.llvm.org
Mon Jul 30 08:10:23 PDT 2018


I've managed to hack together an implementation of this proposal. I've
put it up at <https://reviews.llvm.org/D49989> so you have an idea of
where I'm going with this. It seems to work fine (I've managed to
bootstrap clang with it), though I'm not exactly proud of the
implementation.

Let me know what you think.

pl
On Mon, 30 Jul 2018 at 12:06, Pavel Labath <labath at google.com> wrote:
>
> Thank you for the explanation Chandler. This patch is not on a
> critical path for me, so I don't mind it taking a while (though it
> would be nice to fix as this is a also long standing cause of
> expression evaluation failures in LLDB).
>
> I believe that best solution here would be extend the set of
> conditions under which we can emit the C1/D1 structor flavour as an
> alias. There is no fundamental reason why these symbols cannot be
> aliases even for linkonce linkage types. In fact, gcc will already
> emit aliases in the situations we resort to separate functions.
>
> The reason I could not do it in this patch (*) is that the emission of
> the specific flavours for linkonce structors happens in a lazy matter
> (i.e. only if they are referenced), but in order to correctly decide
> what kind of emission strategy to choose, one needs to know the full
> set of structors that will be emitted (**). This information is not
> accessible (or at least I couldn't find a way to access it, maybe
> someone with more knowledge of the codebase will) from the place which
> does the decision.
>
> If we are able to do that, then the object file size should be almost
> unaffected (the only difference would be an extra symtab entry), and
> it would also solve another issue, which this patch has inadvertently
> caused: PR38338 - debugger stops twice for some constructor
> breakpoints because of how we generate the line tables. In fact on
> Friday, I was considering reverting this patch myself due to this
> issue, but I eventually choose to leave it in as it did not seem too
> hard to fix it going forward.
>
> Let me know what you think about this idea. I am happy to do some of
> the implementation work here, but I think I will need some guidance as
> to how to collect and plumb the required information.
>
> regards,
> pavel
>
>
> (*) It also didn't seem like it was necessary at the time, as this was
> already strictly better than what we are doing on other platforms,
> which is to emit separate functions unconditionally.
>
> (**) If we don't emit a C1 alias, the C2 constructor needs to go to
> the C2 comdat. If we emit the alias, we need to use the C5 comdat. The
> situation is even more complicated with virtual destructors, as there
> D0, D1 and D2 need to go to the same D5 comdat (and D1 can be an
> alias). If D0 is missing/not emitted, then D2 needs to go to the D2
> comdat, and D1 cannot be an alias.
>
>
>
>
>
>
> On Sun, 29 Jul 2018 at 03:15, Chandler Carruth <chandlerc at gmail.com> wrote:
> >
> > On Sat, Jul 28, 2018 at 2:26 AM Chandler Carruth <chandlerc at gmail.com> wrote:
> >>
> >> On Thu, Jul 19, 2018 at 7:10 AM Pavel Labath via cfe-commits <cfe-commits at lists.llvm.org> wrote:
> >>>
> >>> Author: labath
> >>> Date: Thu Jul 19 07:05:22 2018
> >>> New Revision: 337456
> >>>
> >>> URL: http://llvm.org/viewvc/llvm-project?rev=337456&view=rev
> >>> Log:
> >>> [CodeGen] Disable aggressive structor optimizations at -O0, take 3
> >>
> >>
> >> I'm really sorry to do this, but I need to revert this patch.... Let me try to explain....
> >>
> >> We're seeing at least two issues with it that are causing *serious* issues for our users, and due to numerous other things, we really need to get top-of-tree into usable shape sooner rather than later. As this is improving a long-standing deficiency in Clang, I think it is reasonable to revert temporarily while we sort it out. I'm CC-ing Richard Smith and Eric Christopher directly as I'm going to ask them to make sure we get satisfactory answers to why this patch causes us so many problems and how we can make progress here.
> >>
> >> I don't have a test case at the moment, and I want to *very clearly* call out that it is on us to find a test case or otherwise clearly explain what problem this patch causes and how it can be addressed, or else this patch should be re-landed.
> >>
> >> To at least give some idea of what is going wrong here......
> >>
> >> First, this patch does increase object code size. This isn't really unexpected based on the nature of the patch, and it does so ostensibly in order to gain material fidelity improvements to debug information. Despite the fact that the increased object size causes us problems (it made a few hundered of our binaries' inputs too large to fit into the quota for the input files to our internal distributed build system) we tried to soldier onward...
> >>
> >> But it also causes the Gold linker at least to use considerably more memory than it used to. This has resulted in over 400 failures to link executables due to running out of available memory on the system.
> >>
> >> There are a number of possible causes for both the input size issues and the linker memory issue:
> >> - An unexpected side-effect of this change causes lots of redundant sections to be output with -ffunction-sections, all of which merge away into nothing, but this takes huge amounts of object file and linker resources.
> >> - The object file size increease is expected and unavoidable, but there is some bug in the linker being triggered?
> >> - Something else
> >
> >
> > I wanted to at least do some initial investigation before reverting and I've done that now.
> >
> > What I'm seeing is a 10% or higher increase in the number of sections and the size spent on ELF headers. This doesn't seem *too* surprising. What is surprising is that this seems to cause an even larger increase in memory usage by gold.
> >
> > Anyways, we'll have to debug this more to truly understand what the issue is and whether there is something we can do with this patch that would improve things (such as maybe using fewer sections?) or whether this all just needs to be guarded under a flag.
> >
> > I'm going to revert for now, but hopefully first thing Monday, Richard and Eric can help out with suggestions on how to make progress here.
> >
> > Again, really sorry about this, especially on the third iteration....
> > -Chandler
> >
> >>
> >>
> >> It is going to take us some time to investigate these issues, and sadly, all of these issues are breaking builds actively for us.
> >>
> >> So while we will work very hard on getting to a resolution, if you also really need this functionality to land sooner rather than later, what I would request is to add it back under a flag. In fact, if you need that and request it, we can do the actual work of adding it back under a flag. I think that's the least we can do here.
> >>
> >> Anyways, again, sorry to revert with relatively little warning and with relatively little info.
> >>
> >> Feel free to reach out to myself or Richard Smith, or Eric Christopher who will all be looking at this first thing next week to understand exactly what is going on here and what the root causes is and whether any of these issues are due to bugs in some tool or another, or whether these are all due to some particular issue with our source code.
> >>
> >> -Chandler
> >>
> >>>
> >>> The previous version of this patch (r332839) was reverted because it was
> >>> causing "definition with same mangled name as another definition" errors
> >>> in some module builds. This was caused by an unrelated bug in module
> >>> importing which it exposed. The importing problem was fixed in r336240,
> >>> so this recommits the original patch (r332839).
> >>>
> >>> Differential Revision: https://reviews.llvm.org/D46685
> >>>
> >>> Modified:
> >>>     cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp
> >>>     cfe/trunk/test/CodeGenCXX/ctor-dtor-alias.cpp
> >>>     cfe/trunk/test/CodeGenCXX/float16-declarations.cpp
> >>>
> >>> Modified: cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp?rev=337456&r1=337455&r2=337456&view=diff
> >>> ==============================================================================
> >>> --- cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp (original)
> >>> +++ cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp Thu Jul 19 07:05:22 2018
> >>> @@ -3737,12 +3737,22 @@ static StructorCodegen getCodegenToUse(C
> >>>    }
> >>>    llvm::GlobalValue::LinkageTypes Linkage = CGM.getFunctionLinkage(AliasDecl);
> >>>
> >>> -  if (llvm::GlobalValue::isDiscardableIfUnused(Linkage))
> >>> -    return StructorCodegen::RAUW;
> >>> +  // All discardable structors can be RAUWed, but we don't want to do that in
> >>> +  // unoptimized code, as that makes complete structor symbol disappear
> >>> +  // completely, which degrades debugging experience.
> >>> +  // Symbols with private linkage can be safely aliased, so we special case them
> >>> +  // here.
> >>> +  if (llvm::GlobalValue::isLocalLinkage(Linkage))
> >>> +    return CGM.getCodeGenOpts().OptimizationLevel > 0 ? StructorCodegen::RAUW
> >>> +                                                      : StructorCodegen::Alias;
> >>>
> >>> +  // Linkonce structors cannot be aliased nor placed in a comdat, so these need
> >>> +  // to be emitted separately.
> >>>    // FIXME: Should we allow available_externally aliases?
> >>> -  if (!llvm::GlobalAlias::isValidLinkage(Linkage))
> >>> -    return StructorCodegen::RAUW;
> >>> +  if (llvm::GlobalValue::isDiscardableIfUnused(Linkage) ||
> >>> +      !llvm::GlobalAlias::isValidLinkage(Linkage))
> >>> +    return CGM.getCodeGenOpts().OptimizationLevel > 0 ? StructorCodegen::RAUW
> >>> +                                                      : StructorCodegen::Emit;
> >>>
> >>>    if (llvm::GlobalValue::isWeakForLinker(Linkage)) {
> >>>      // Only ELF and wasm support COMDATs with arbitrary names (C5/D5).
> >>>
> >>> Modified: cfe/trunk/test/CodeGenCXX/ctor-dtor-alias.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/ctor-dtor-alias.cpp?rev=337456&r1=337455&r2=337456&view=diff
> >>> ==============================================================================
> >>> --- cfe/trunk/test/CodeGenCXX/ctor-dtor-alias.cpp (original)
> >>> +++ cfe/trunk/test/CodeGenCXX/ctor-dtor-alias.cpp Thu Jul 19 07:05:22 2018
> >>> @@ -1,5 +1,7 @@
> >>> -// RUN: %clang_cc1 %s -triple i686-linux -emit-llvm -o - -mconstructor-aliases | FileCheck --check-prefix=NOOPT %s
> >>> -
> >>> +// RUN: %clang_cc1 %s -triple i686-linux -emit-llvm -o - -mconstructor-aliases > %t
> >>> +// RUN: FileCheck --check-prefix=NOOPT1 --input-file=%t %s
> >>> +// RUN: FileCheck --check-prefix=NOOPT2 --input-file=%t %s
> >>> +// RUN: FileCheck --check-prefix=NOOPT3 --input-file=%t %s
> >>>  // RUN: %clang_cc1 %s -triple i686-linux -emit-llvm -o - -mconstructor-aliases -O1 -disable-llvm-passes > %t
> >>>  // RUN: FileCheck --check-prefix=CHECK1 --input-file=%t %s
> >>>  // RUN: FileCheck --check-prefix=CHECK2 --input-file=%t %s
> >>> @@ -21,6 +23,13 @@ namespace test1 {
> >>>  // CHECK1: define weak_odr void @_ZN5test16foobarIvED0Ev({{.*}} comdat($_ZN5test16foobarIvED5Ev)
> >>>  // CHECK1-NOT: comdat
> >>>
> >>> +// This should happen regardless of the opt level.
> >>> +// NOOPT1: @_ZN5test16foobarIvEC1Ev = weak_odr unnamed_addr alias void {{.*}} @_ZN5test16foobarIvEC2Ev
> >>> +// NOOPT1: @_ZN5test16foobarIvED1Ev = weak_odr unnamed_addr alias void (%"struct.test1::foobar"*), void (%"struct.test1::foobar"*)* @_ZN5test16foobarIvED2Ev
> >>> +// NOOPT1: define weak_odr void @_ZN5test16foobarIvEC2Ev({{.*}} comdat($_ZN5test16foobarIvEC5Ev)
> >>> +// NOOPT1: define weak_odr void @_ZN5test16foobarIvED2Ev({{.*}} comdat($_ZN5test16foobarIvED5Ev)
> >>> +// NOOPT1: define weak_odr void @_ZN5test16foobarIvED0Ev({{.*}} comdat($_ZN5test16foobarIvED5Ev)
> >>> +
> >>>  // COFF doesn't support comdats with arbitrary names (C5/D5).
> >>>  // COFF: define weak_odr {{.*}} void @_ZN5test16foobarIvEC2Ev({{.*}} comdat align
> >>>  // COFF: define weak_odr {{.*}} void @_ZN5test16foobarIvEC1Ev({{.*}} comdat align
> >>> @@ -37,12 +46,17 @@ template struct foobar<void>;
> >>>  }
> >>>
> >>>  namespace test2 {
> >>> -// test that when the destrucor is linkonce_odr we just replace every use of
> >>> +// test that when the destructor is linkonce_odr we just replace every use of
> >>>  // C1 with C2.
> >>>
> >>>  // CHECK1: define internal void @__cxx_global_var_init()
> >>>  // CHECK1: call void @_ZN5test26foobarIvEC2Ev
> >>>  // CHECK1: define linkonce_odr void @_ZN5test26foobarIvEC2Ev({{.*}} comdat align
> >>> +
> >>> +// At -O0, we should still emit the complete constructor.
> >>> +// NOOPT1: define internal void @__cxx_global_var_init()
> >>> +// NOOPT1: call void @_ZN5test26foobarIvEC1Ev
> >>> +// NOOPT1: define linkonce_odr void @_ZN5test26foobarIvEC1Ev({{.*}} comdat align
> >>>  void g();
> >>>  template <typename T> struct foobar {
> >>>    foobar() { g(); }
> >>> @@ -57,6 +71,11 @@ namespace test3 {
> >>>  // CHECK1: define internal void @__cxx_global_var_init.1()
> >>>  // CHECK1: call i32 @__cxa_atexit{{.*}}_ZN5test312_GLOBAL__N_11AD2Ev
> >>>  // CHECK1: define internal void @_ZN5test312_GLOBAL__N_11AD2Ev(
> >>> +
> >>> +// We can use an alias for internal symbols at -O0.
> >>> +// NOOPT2: _ZN5test312_GLOBAL__N_11BD1Ev = internal unnamed_addr alias void {{.*}} @_ZN5test312_GLOBAL__N_11BD2Ev
> >>> +// NOOPT2: define internal void @__cxx_global_var_init.1()
> >>> +// NOOPT2: call i32 @__cxa_atexit{{.*}}_ZN5test312_GLOBAL__N_11BD1Ev
> >>>  namespace {
> >>>  struct A {
> >>>    ~A() {}
> >>> @@ -77,11 +96,12 @@ namespace test4 {
> >>>    // CHECK1: call i32 @__cxa_atexit{{.*}}_ZN5test41AD2Ev
> >>>    // CHECK1: define linkonce_odr void @_ZN5test41AD2Ev({{.*}} comdat align
> >>>
> >>> -  // test that we don't do this optimization at -O0 so that the debugger can
> >>> -  // see both destructors.
> >>> -  // NOOPT: define internal void @__cxx_global_var_init.2()
> >>> -  // NOOPT: call i32 @__cxa_atexit{{.*}}@_ZN5test41BD2Ev
> >>> -  // NOOPT: define linkonce_odr void @_ZN5test41BD2Ev({{.*}} comdat align
> >>> +  // Test that we don't do this optimization at -O0 and call the complete
> >>> +  // destructor for B instead. This enables the debugger to see both
> >>> +  // destructors.
> >>> +  // NOOPT2: define internal void @__cxx_global_var_init.2()
> >>> +  // NOOPT2: call i32 @__cxa_atexit{{.*}}@_ZN5test41BD1Ev
> >>> +  // NOOPT2: define linkonce_odr void @_ZN5test41BD1Ev({{.*}} comdat align
> >>>    struct A {
> >>>      virtual ~A() {}
> >>>    };
> >>> @@ -129,6 +149,11 @@ namespace test7 {
> >>>    // out if we should).
> >>>    // pr17875.
> >>>    // CHECK3: define void @_ZN5test71BD2Ev
> >>> +
> >>> +  // At -O0, we should emit both destructors, the complete can be an alias to
> >>> +  // the base one.
> >>> +  // NOOPT3: @_ZN5test71BD1Ev = unnamed_addr alias void {{.*}} @_ZN5test71BD2Ev
> >>> +  // NOOPT3: define void @_ZN5test71BD2Ev
> >>>    template <typename> struct A {
> >>>      ~A() {}
> >>>    };
> >>>
> >>> Modified: cfe/trunk/test/CodeGenCXX/float16-declarations.cpp
> >>> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/float16-declarations.cpp?rev=337456&r1=337455&r2=337456&view=diff
> >>> ==============================================================================
> >>> --- cfe/trunk/test/CodeGenCXX/float16-declarations.cpp (original)
> >>> +++ cfe/trunk/test/CodeGenCXX/float16-declarations.cpp Thu Jul 19 07:05:22 2018
> >>> @@ -103,7 +103,7 @@ int main(void) {
> >>>
> >>>    C1 c1(f1l);
> >>>  // CHECK-DAG:  [[F1L:%[a-z0-9]+]] = load half, half* %{{.*}}, align 2
> >>> -// CHECK-DAG:  call void @_ZN2C1C2EDF16_(%class.C1* %{{.*}}, half %{{.*}})
> >>> +// CHECK-DAG:  call void @_ZN2C1C1EDF16_(%class.C1* %{{.*}}, half %{{.*}})
> >>>
> >>>    S1<_Float16> s1 = { 132.f16 };
> >>>  // CHECK-DAG: @_ZZ4mainE2s1 = private unnamed_addr constant %struct.S1 { half 0xH5820 }, align 2
> >>>
> >>>
> >>> _______________________________________________
> >>> cfe-commits mailing list
> >>> cfe-commits at lists.llvm.org
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


More information about the cfe-commits mailing list