[PATCH] D11908: Clang support for -fthinlto.

Teresa Johnson via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 30 10:40:48 PDT 2015


On Wed, Sep 30, 2015 at 10:19 AM, Duncan P. N. Exon Smith
<dexonsmith at apple.com> wrote:
>
>> On 2015-Sep-23, at 10:28, Teresa Johnson <tejohnson at google.com> wrote:
>>
>> tejohnson updated this revision to Diff 35527.
>> tejohnson added a comment.
>>
>> Updated the patch for the new LLVM/gold patch (http://reviews.llvm.org/D13107).
>>
>> Two changes:
>>
>> - I put back the original change to pass a new plugin option to gold. Since the function index design/name has been generalized to not be ThinLTO-specific, it makes sense to have the ThinLTO-related gold behavior for modules with those sections to be dependent on an option.
>> - Added 2 tests.
>>
>>
>> http://reviews.llvm.org/D11908
>>
>> Files:
>>  include/clang/Driver/Options.td
>>  include/clang/Frontend/CodeGenOptions.def
>>  lib/CodeGen/BackendUtil.cpp
>>  lib/Driver/Driver.cpp
>>  lib/Driver/Tools.cpp
>>  lib/Frontend/CompilerInvocation.cpp
>>  test/Driver/thinlto.c
>>  test/Misc/thinlto.c
>>
>> <D11908.35527.patch>
>
>> Index: test/Misc/thinlto.c
>> ===================================================================
>> --- /dev/null
>> +++ test/Misc/thinlto.c
>> @@ -0,0 +1,22 @@
>> +// RUN: %clang_cc1 -fthinlto -emit-llvm-bc < %s | llvm-bcanalyzer -dump | FileCheck %s
>> +// CHECK: <FUNCTION_SUMMARY_BLOCK
>> +// CHECK-NEXT: <PERMODULE_ENTRY
>> +// CHECK-NEXT: <PERMODULE_ENTRY
>> +// CHECK-NEXT: </FUNCTION_SUMMARY_BLOCK
>> +
>> +// RUN: %clang -fthinlto %s -o %t -fuse-ld=gold
>
> Why isn't this using -cc1?

I can try it again, but I believe -fuse-ld is not a -cc1 option.

>
>> +// RUN: llvm-bcanalyzer -dump %t.thinlto.bc | FileCheck %s --check-prefix=COMBINED
>> +// COMBINED: <MODULE_STRTAB_BLOCK
>> +// COMBINED-NEXT: <ENTRY
>> +// COMBINED-NEXT: </MODULE_STRTAB_BLOCK
>> +// COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK
>> +// COMBINED-NEXT: <COMBINED_ENTRY
>> +// COMBINED-NEXT: <COMBINED_ENTRY
>> +// COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK
>> +
>> +__attribute__((noinline)) void foo() {
>> +}
>> +
>> +int main() {
>> +  foo();
>> +}
>> Index: test/Driver/thinlto.c
>> ===================================================================
>> --- /dev/null
>> +++ test/Driver/thinlto.c
>> @@ -0,0 +1,11 @@
>> +// -fthinlto causes a switch to llvm-bc object files.
>> +// RUN: %clang -ccc-print-phases -c %s -fthinlto 2> %t.log
>> +// RUN: grep '2: compiler, {1}, ir' %t.log
>> +// RUN: grep '3: backend, {2}, lto-bc' %t.log
>> +
>> +// RUN: %clang -ccc-print-phases %s -fthinlto 2> %t.log
>> +// RUN: grep '0: input, ".*thinlto.c", c' %t.log
>> +// RUN: grep '1: preprocessor, {0}, cpp-output' %t.log
>> +// RUN: grep '2: compiler, {1}, ir' %t.log
>> +// RUN: grep '3: backend, {2}, lto-bc' %t.log
>> +// RUN: grep '4: linker, {3}, image' %t.log
>
> Usually we prefer FileCheck tests over grep (typically the grep tests
> are just old and haven't been converted yet).  Is there a particular
> reason you're using grep, or were you just copying an example?

This was cloned from test/Driver/lto.c. Let me see if I can rework
this with FileCheck.

>
>> Index: lib/Frontend/CompilerInvocation.cpp
>> ===================================================================
>> --- lib/Frontend/CompilerInvocation.cpp
>> +++ lib/Frontend/CompilerInvocation.cpp
>> @@ -518,6 +518,7 @@
>>    Opts.MergeFunctions = Args.hasArg(OPT_fmerge_functions);
>>
>>    Opts.PrepareForLTO = Args.hasArg(OPT_flto);
>> +  Opts.EmitThinLTOIndex = Args.hasArg(OPT_fthinlto);
>>
>>    Opts.MSVolatile = Args.hasArg(OPT_fms_volatile);
>>
>> Index: lib/Driver/Tools.cpp
>> ===================================================================
>> --- lib/Driver/Tools.cpp
>> +++ lib/Driver/Tools.cpp
>> @@ -1664,6 +1664,9 @@
>>    std::string CPU = getCPUName(Args, ToolChain.getTriple());
>>    if (!CPU.empty())
>>      CmdArgs.push_back(Args.MakeArgString(Twine("-plugin-opt=mcpu=") + CPU));
>> +
>> +  if (Args.hasArg(options::OPT_fthinlto))
>> +    CmdArgs.push_back("-plugin-opt=thinlto");
>>  }
>>
>>  /// This is a helper function for validating the optional refinement step
>> @@ -2818,6 +2821,8 @@
>>    // preprocessing, precompiling or assembling.
>>    Args.ClaimAllArgs(options::OPT_flto);
>>    Args.ClaimAllArgs(options::OPT_fno_lto);
>> +  Args.ClaimAllArgs(options::OPT_fthinlto);
>> +  Args.ClaimAllArgs(options::OPT_fno_thinlto);
>>  }
>>
>>  static void appendUserToPath(SmallVectorImpl<char> &Result) {
>> @@ -3236,6 +3241,8 @@
>>             "Invalid action for clang tool.");
>>
>>      if (JA.getType() == types::TY_LTO_IR || JA.getType() == types::TY_LTO_BC) {
>> +      // Enables PrepareForLTO, which we want for -fthinlto as well.
>> +      // ThinLTO also uses the TY_LTO_* types.
>>        CmdArgs.push_back("-flto");
>>      }
>>      if (JA.getType() == types::TY_Nothing) {
>> @@ -3268,6 +3275,10 @@
>>      // the use-list order, since serialization to bitcode is part of the flow.
>>      if (JA.getType() == types::TY_LLVM_BC)
>>        CmdArgs.push_back("-emit-llvm-uselists");
>> +
>> +    if (Args.hasArg(options::OPT_fthinlto)) {
>> +      CmdArgs.push_back("-fthinlto");
>> +    }
>>    }
>>
>>    // We normally speed up the clang process a bit by skipping destructors at
>> Index: lib/Driver/Driver.cpp
>> ===================================================================
>> --- lib/Driver/Driver.cpp
>> +++ lib/Driver/Driver.cpp
>> @@ -1576,7 +1576,8 @@
>>  }
>>
>>  bool Driver::IsUsingLTO(const ArgList &Args) const {
>> -  return Args.hasFlag(options::OPT_flto, options::OPT_fno_lto, false);
>> +  return Args.hasFlag(options::OPT_flto, options::OPT_fno_lto, false) ||
>> +      Args.hasFlag(options::OPT_fthinlto, options::OPT_fno_thinlto, false);
>>  }
>>
>>  void Driver::BuildJobs(Compilation &C) const {
>> Index: lib/CodeGen/BackendUtil.cpp
>> ===================================================================
>> --- lib/CodeGen/BackendUtil.cpp
>> +++ lib/CodeGen/BackendUtil.cpp
>> @@ -268,6 +268,7 @@
>>      return;
>>
>>    unsigned OptLevel = CodeGenOpts.OptimizationLevel;
>> +
>
> This whitespace change seems unrelated.

Will remove.

>
>>    CodeGenOptions::InliningMethod Inlining = CodeGenOpts.getInlining();
>>
>>    // Handle disabling of LLVM optimization, where we want to preserve the
>> @@ -610,7 +611,8 @@
>>
>>    case Backend_EmitBC:
>>      getPerModulePasses()->add(
>> -        createBitcodeWriterPass(*OS, CodeGenOpts.EmitLLVMUseLists));
>> +        createBitcodeWriterPass(*OS, CodeGenOpts.EmitLLVMUseLists,
>> +                                CodeGenOpts.EmitThinLTOIndex));
>>      break;
>>
>>    case Backend_EmitLL:
>> Index: include/clang/Frontend/CodeGenOptions.def
>> ===================================================================
>> --- include/clang/Frontend/CodeGenOptions.def
>> +++ include/clang/Frontend/CodeGenOptions.def
>> @@ -72,7 +72,9 @@
>>  CODEGENOPT(InstrumentForProfiling , 1, 0) ///< Set when -pg is enabled.
>>  CODEGENOPT(LessPreciseFPMAD  , 1, 0) ///< Enable less precise MAD instructions to
>>                                       ///< be generated.
>> -CODEGENOPT(PrepareForLTO     , 1, 0) ///< Set when -flto is enabled on the
>> +CODEGENOPT(PrepareForLTO     , 1, 0) ///< Set when -flto or -fthinlto is enabled
>> +                                     ///< on the compile step.
>> +CODEGENOPT(EmitThinLTOIndex  , 1, 0) ///< Set when -fthinlto is enabled on the
>>                                       ///< compile step.
>>  CODEGENOPT(MergeAllConstants , 1, 1) ///< Merge identical constants.
>>  CODEGENOPT(MergeFunctions    , 1, 0) ///< Set when -fmerge-functions is enabled.
>> Index: include/clang/Driver/Options.td
>> ===================================================================
>> --- include/clang/Driver/Options.td
>> +++ include/clang/Driver/Options.td
>> @@ -686,6 +686,9 @@
>>  def flto_EQ : Joined<["-"], "flto=">, Group<clang_ignored_gcc_optimization_f_Group>;
>>  def flto : Flag<["-"], "flto">, Flags<[CC1Option]>, Group<f_Group>;
>>  def fno_lto : Flag<["-"], "fno-lto">, Group<f_Group>;
>> +def fthinlto : Flag<["-"], "fthinlto">, Flags<[CC1Option]>, Group<f_Group>;
>
> I wonder whether this is the right way to add new variants of LTO.
> I.e., maybe this should be "-flto=thin"?  Do you happen to know how this
> would conflict with GCC options?  (I see we ignore "-flto="...)

I looked at GCC docs and sources. It does have a -flto= variant. From
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html:
-----------------
If you specify the optional n, the optimization and code generation
done at link time is executed in parallel using n parallel jobs by
utilizing an installed make program. The environment variable MAKE may
be used to override the program used. The default value for n is 1.

You can also specify -flto=jobserver to use GNU make's job server mode
to determine the number of parallel jobs. This is useful when the
Makefile calling GCC is already executing in parallel. You must
prepend a ‘+’ to the command recipe in the parent Makefile for this to
work. This option likely only works if MAKE is GNU make.
----------------

I would anticipate that we may want to support something like this for
ThinLTO eventually to specify the number of parallel jobs for the
ThinLTO backend processes. So it might be confusing to overload
-flto=.

>
> E.g., as a user I'd expect -fno-lto to turn off whatever LTO was turned
> on.  And I might expect -flto to override -fthinlto, or vice versa.

Good point, I think the last one should win. Looks like right now
-fthinlto is effectively always winning. I will fix that.

Not sure about -fno-lto behavior if we leave -fthinlto as a separate
option. Let me know what you think based on the GCC usage of -flto=.

>
>> +def fthinlto_be : Flag<["-"], "fthinlto-be">, Flags<[CC1Option]>, Group<f_Group>;
>
> What's -fthinlto-be for?  It seems to be unused/untested.

That snuck in from some follow-on patches I am working on (invoking
the LTO pipeline + TBD importing pass in the BE processes). Will
remove from here.

>
>> +def fno_thinlto : Flag<["-"], "fno-thinlto">, Group<f_Group>;
>>  def fmacro_backtrace_limit_EQ : Joined<["-"], "fmacro-backtrace-limit=">,
>>                                  Group<f_Group>, Flags<[DriverOption, CoreOption]>;
>>  def fmerge_all_constants : Flag<["-"], "fmerge-all-constants">, Group<f_Group>;
>>
>



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413


More information about the cfe-commits mailing list