[PATCH] D11908: Clang support for -fthinlto.

Teresa Johnson via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 30 12:39:26 PDT 2015


On Wed, Sep 30, 2015 at 11:11 AM, Duncan P. N. Exon Smith
<dexonsmith at apple.com> wrote:
>
>> On 2015-Sep-30, at 10:40, Teresa Johnson <tejohnson at google.com> wrote:
>>
>> On Wed, Sep 30, 2015 at 10:19 AM, Duncan P. N. Exon Smith
>> <dexonsmith at apple.com> wrote:
>>>
>>>> On 2015-Sep-23, at 10:28, Teresa Johnson <tejohnson at google.com> wrote:
>>>>
>>>> tejohnson updated this revision to Diff 35527.
>>>> tejohnson added a comment.
>>>>
>>>> Updated the patch for the new LLVM/gold patch (http://reviews.llvm.org/D13107).
>>>>
>>>> Two changes:
>>>>
>>>> - I put back the original change to pass a new plugin option to gold. Since the function index design/name has been generalized to not be ThinLTO-specific, it makes sense to have the ThinLTO-related gold behavior for modules with those sections to be dependent on an option.
>>>> - Added 2 tests.
>>>>
>>>>
>>>> http://reviews.llvm.org/D11908
>>>>
>>>> Files:
>>>> include/clang/Driver/Options.td
>>>> include/clang/Frontend/CodeGenOptions.def
>>>> lib/CodeGen/BackendUtil.cpp
>>>> lib/Driver/Driver.cpp
>>>> lib/Driver/Tools.cpp
>>>> lib/Frontend/CompilerInvocation.cpp
>>>> test/Driver/thinlto.c
>>>> test/Misc/thinlto.c
>>>>
>>>> <D11908.35527.patch>
>>>
>>>> Index: test/Misc/thinlto.c
>>>> ===================================================================
>>>> --- /dev/null
>>>> +++ test/Misc/thinlto.c
>>>> @@ -0,0 +1,22 @@
>>>> +// RUN: %clang_cc1 -fthinlto -emit-llvm-bc < %s | llvm-bcanalyzer -dump | FileCheck %s
>>>> +// CHECK: <FUNCTION_SUMMARY_BLOCK
>>>> +// CHECK-NEXT: <PERMODULE_ENTRY
>>>> +// CHECK-NEXT: <PERMODULE_ENTRY
>>>> +// CHECK-NEXT: </FUNCTION_SUMMARY_BLOCK
>>>> +
>>>> +// RUN: %clang -fthinlto %s -o %t -fuse-ld=gold
>>>
>>> Why isn't this using -cc1?
>>
>> I can try it again, but I believe -fuse-ld is not a -cc1 option.
>
> Oh, right, of course.  -cc1 doesn't invoke the linker, the driver does.
>
> In that case, it seems to me like you should:
> 1. have a driver test that uses -### to check that you're getting the
> expected linker command-line (in test/Driver), and

Added that to my new Driver test when I was changing it to switch the
greps to FileCheck.

> 2. somewhere (probably in test/tools/gold in LLVM?) confirm that gold
> does the right thing with the thinlto plugin.

There's already one in the companion patch D13107. So I will go ahead
and remove this part (I'll go ahead and leave in the first part of
test/Misc/thinlto.c above).

>
>
>
>>>> +// RUN: llvm-bcanalyzer -dump %t.thinlto.bc | FileCheck %s --check-prefix=COMBINED
>>>> +// COMBINED: <MODULE_STRTAB_BLOCK
>>>> +// COMBINED-NEXT: <ENTRY
>>>> +// COMBINED-NEXT: </MODULE_STRTAB_BLOCK
>>>> +// COMBINED-NEXT: <FUNCTION_SUMMARY_BLOCK
>>>> +// COMBINED-NEXT: <COMBINED_ENTRY
>>>> +// COMBINED-NEXT: <COMBINED_ENTRY
>>>> +// COMBINED-NEXT: </FUNCTION_SUMMARY_BLOCK
>>>> +
>>>> +__attribute__((noinline)) void foo() {
>>>> +}
>>>> +
>>>> +int main() {
>>>> +  foo();
>>>> +}
>>>> Index: test/Driver/thinlto.c
>>>> ===================================================================
>>>> --- /dev/null
>>>> +++ test/Driver/thinlto.c
>>>> @@ -0,0 +1,11 @@
>>>> +// -fthinlto causes a switch to llvm-bc object files.
>>>> +// RUN: %clang -ccc-print-phases -c %s -fthinlto 2> %t.log
>>>> +// RUN: grep '2: compiler, {1}, ir' %t.log
>>>> +// RUN: grep '3: backend, {2}, lto-bc' %t.log
>>>> +
>>>> +// RUN: %clang -ccc-print-phases %s -fthinlto 2> %t.log
>>>> +// RUN: grep '0: input, ".*thinlto.c", c' %t.log
>>>> +// RUN: grep '1: preprocessor, {0}, cpp-output' %t.log
>>>> +// RUN: grep '2: compiler, {1}, ir' %t.log
>>>> +// RUN: grep '3: backend, {2}, lto-bc' %t.log
>>>> +// RUN: grep '4: linker, {3}, image' %t.log
>>>
>>> Usually we prefer FileCheck tests over grep (typically the grep tests
>>> are just old and haven't been converted yet).  Is there a particular
>>> reason you're using grep, or were you just copying an example?
>>
>> This was cloned from test/Driver/lto.c. Let me see if I can rework
>> this with FileCheck.
>>
>>>
>>>> Index: lib/Frontend/CompilerInvocation.cpp
>>>> ===================================================================
>>>> --- lib/Frontend/CompilerInvocation.cpp
>>>> +++ lib/Frontend/CompilerInvocation.cpp
>>>> @@ -518,6 +518,7 @@
>>>>   Opts.MergeFunctions = Args.hasArg(OPT_fmerge_functions);
>>>>
>>>>   Opts.PrepareForLTO = Args.hasArg(OPT_flto);
>>>> +  Opts.EmitThinLTOIndex = Args.hasArg(OPT_fthinlto);
>>>>
>>>>   Opts.MSVolatile = Args.hasArg(OPT_fms_volatile);
>>>>
>>>> Index: lib/Driver/Tools.cpp
>>>> ===================================================================
>>>> --- lib/Driver/Tools.cpp
>>>> +++ lib/Driver/Tools.cpp
>>>> @@ -1664,6 +1664,9 @@
>>>>   std::string CPU = getCPUName(Args, ToolChain.getTriple());
>>>>   if (!CPU.empty())
>>>>     CmdArgs.push_back(Args.MakeArgString(Twine("-plugin-opt=mcpu=") + CPU));
>>>> +
>>>> +  if (Args.hasArg(options::OPT_fthinlto))
>>>> +    CmdArgs.push_back("-plugin-opt=thinlto");
>>>> }
>>>>
>>>> /// This is a helper function for validating the optional refinement step
>>>> @@ -2818,6 +2821,8 @@
>>>>   // preprocessing, precompiling or assembling.
>>>>   Args.ClaimAllArgs(options::OPT_flto);
>>>>   Args.ClaimAllArgs(options::OPT_fno_lto);
>>>> +  Args.ClaimAllArgs(options::OPT_fthinlto);
>>>> +  Args.ClaimAllArgs(options::OPT_fno_thinlto);
>>>> }
>>>>
>>>> static void appendUserToPath(SmallVectorImpl<char> &Result) {
>>>> @@ -3236,6 +3241,8 @@
>>>>            "Invalid action for clang tool.");
>>>>
>>>>     if (JA.getType() == types::TY_LTO_IR || JA.getType() == types::TY_LTO_BC) {
>>>> +      // Enables PrepareForLTO, which we want for -fthinlto as well.
>>>> +      // ThinLTO also uses the TY_LTO_* types.
>>>>       CmdArgs.push_back("-flto");
>>>>     }
>>>>     if (JA.getType() == types::TY_Nothing) {
>>>> @@ -3268,6 +3275,10 @@
>>>>     // the use-list order, since serialization to bitcode is part of the flow.
>>>>     if (JA.getType() == types::TY_LLVM_BC)
>>>>       CmdArgs.push_back("-emit-llvm-uselists");
>>>> +
>>>> +    if (Args.hasArg(options::OPT_fthinlto)) {
>>>> +      CmdArgs.push_back("-fthinlto");
>>>> +    }
>>>>   }
>>>>
>>>>   // We normally speed up the clang process a bit by skipping destructors at
>>>> Index: lib/Driver/Driver.cpp
>>>> ===================================================================
>>>> --- lib/Driver/Driver.cpp
>>>> +++ lib/Driver/Driver.cpp
>>>> @@ -1576,7 +1576,8 @@
>>>> }
>>>>
>>>> bool Driver::IsUsingLTO(const ArgList &Args) const {
>>>> -  return Args.hasFlag(options::OPT_flto, options::OPT_fno_lto, false);
>>>> +  return Args.hasFlag(options::OPT_flto, options::OPT_fno_lto, false) ||
>>>> +      Args.hasFlag(options::OPT_fthinlto, options::OPT_fno_thinlto, false);
>>>> }
>>>>
>>>> void Driver::BuildJobs(Compilation &C) const {
>>>> Index: lib/CodeGen/BackendUtil.cpp
>>>> ===================================================================
>>>> --- lib/CodeGen/BackendUtil.cpp
>>>> +++ lib/CodeGen/BackendUtil.cpp
>>>> @@ -268,6 +268,7 @@
>>>>     return;
>>>>
>>>>   unsigned OptLevel = CodeGenOpts.OptimizationLevel;
>>>> +
>>>
>>> This whitespace change seems unrelated.
>>
>> Will remove.
>>
>>>
>>>>   CodeGenOptions::InliningMethod Inlining = CodeGenOpts.getInlining();
>>>>
>>>>   // Handle disabling of LLVM optimization, where we want to preserve the
>>>> @@ -610,7 +611,8 @@
>>>>
>>>>   case Backend_EmitBC:
>>>>     getPerModulePasses()->add(
>>>> -        createBitcodeWriterPass(*OS, CodeGenOpts.EmitLLVMUseLists));
>>>> +        createBitcodeWriterPass(*OS, CodeGenOpts.EmitLLVMUseLists,
>>>> +                                CodeGenOpts.EmitThinLTOIndex));
>>>>     break;
>>>>
>>>>   case Backend_EmitLL:
>>>> Index: include/clang/Frontend/CodeGenOptions.def
>>>> ===================================================================
>>>> --- include/clang/Frontend/CodeGenOptions.def
>>>> +++ include/clang/Frontend/CodeGenOptions.def
>>>> @@ -72,7 +72,9 @@
>>>> CODEGENOPT(InstrumentForProfiling , 1, 0) ///< Set when -pg is enabled.
>>>> CODEGENOPT(LessPreciseFPMAD  , 1, 0) ///< Enable less precise MAD instructions to
>>>>                                      ///< be generated.
>>>> -CODEGENOPT(PrepareForLTO     , 1, 0) ///< Set when -flto is enabled on the
>>>> +CODEGENOPT(PrepareForLTO     , 1, 0) ///< Set when -flto or -fthinlto is enabled
>>>> +                                     ///< on the compile step.
>>>> +CODEGENOPT(EmitThinLTOIndex  , 1, 0) ///< Set when -fthinlto is enabled on the
>>>>                                      ///< compile step.
>>>> CODEGENOPT(MergeAllConstants , 1, 1) ///< Merge identical constants.
>>>> CODEGENOPT(MergeFunctions    , 1, 0) ///< Set when -fmerge-functions is enabled.
>>>> Index: include/clang/Driver/Options.td
>>>> ===================================================================
>>>> --- include/clang/Driver/Options.td
>>>> +++ include/clang/Driver/Options.td
>>>> @@ -686,6 +686,9 @@
>>>> def flto_EQ : Joined<["-"], "flto=">, Group<clang_ignored_gcc_optimization_f_Group>;
>>>> def flto : Flag<["-"], "flto">, Flags<[CC1Option]>, Group<f_Group>;
>>>> def fno_lto : Flag<["-"], "fno-lto">, Group<f_Group>;
>>>> +def fthinlto : Flag<["-"], "fthinlto">, Flags<[CC1Option]>, Group<f_Group>;
>>>
>>> I wonder whether this is the right way to add new variants of LTO.
>>> I.e., maybe this should be "-flto=thin"?  Do you happen to know how this
>>> would conflict with GCC options?  (I see we ignore "-flto="...)
>>
>> I looked at GCC docs and sources. It does have a -flto= variant. From
>> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html:
>> -----------------
>> If you specify the optional n, the optimization and code generation
>> done at link time is executed in parallel using n parallel jobs by
>> utilizing an installed make program. The environment variable MAKE may
>> be used to override the program used. The default value for n is 1.
>>
>> You can also specify -flto=jobserver to use GNU make's job server mode
>> to determine the number of parallel jobs. This is useful when the
>> Makefile calling GCC is already executing in parallel. You must
>> prepend a ‘+’ to the command recipe in the parent Makefile for this to
>> work. This option likely only works if MAKE is GNU make.
>> ----------------
>>
>> I would anticipate that we may want to support something like this for
>> ThinLTO eventually to specify the number of parallel jobs for the
>> ThinLTO backend processes. So it might be confusing to overload
>> -flto=.
>
> Hmm.  You're right that the GCC usage seems pretty different.
>
> I guess you're envisioning -fthinlto=jobserver?

Or -fthinlto=n.

> I wonder if we could
> do this as -flto=thin,jobserver or something similar?

I am ok with -flto=thin and then later extending to -flto=thin,n etc.

This simplifies the interaction with -fno-lto.

I think a -flto=thin followed by -flto should probably revert to
normal LTO, WDYT?

Another thing to consider is to add -flto=full or something like that
which is currently an alias for -flto. So we would have:

-flto=<type> where type could be "full" or "thin"
-flto means -flto=full
-fno-lto disables all types of flto
and last option of the above 3 wins.

>
> It's pretty hard to remove driver options, so I think it's important to
> get the interface right.  I anticipate that we'll add more LTO variants
> in the future, so we should take care that this scales reasonably.
>
> (This may be a good discussion for cfe-dev, not sure.)
>
>>> E.g., as a user I'd expect -fno-lto to turn off whatever LTO was turned
>>> on.  And I might expect -flto to override -fthinlto, or vice versa.
>>
>> Good point, I think the last one should win. Looks like right now
>> -fthinlto is effectively always winning. I will fix that.
>>
>> Not sure about -fno-lto behavior if we leave -fthinlto as a separate
>> option. Let me know what you think based on the GCC usage of -flto=.
>
> Right, if they're totally separate, then either behaviour for -fno-lto
> is confusing.
>
>>>> +def fthinlto_be : Flag<["-"], "fthinlto-be">, Flags<[CC1Option]>, Group<f_Group>;
>>>
>>> What's -fthinlto-be for?  It seems to be unused/untested.
>>
>> That snuck in from some follow-on patches I am working on (invoking
>> the LTO pipeline + TBD importing pass in the BE processes). Will
>> remove from here.
>>
>>>
>>>> +def fno_thinlto : Flag<["-"], "fno-thinlto">, Group<f_Group>;
>>>> def fmacro_backtrace_limit_EQ : Joined<["-"], "fmacro-backtrace-limit=">,
>>>>                                 Group<f_Group>, Flags<[DriverOption, CoreOption]>;
>>>> def fmerge_all_constants : Flag<["-"], "fmerge-all-constants">, Group<f_Group>;
>>>>
>>>
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
>



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413


More information about the cfe-commits mailing list