[PATCH] D17654: [PGO] Remove duplicate profile entries for functions with available_externally linkage
Xinliang David Li via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 26 15:01:38 PST 2016
I guess I will remember not to write long scary comments next time ;)
David
On Fri, Feb 26, 2016 at 2:35 PM, David Blaikie <dblaikie at gmail.com> wrote:
> Rightio - thanks for walking me through it. Think I'm
> following/understand. Though the long comment you're adding confuses me in
> a few ways, but I'll write that off as lack of familiarity with this code &
> leave you to it.
>
> (Guess there's no sane way to emit the counter code such that it just
> no-ops if the counter isn't linked in from the external def - that'd seem
> ideal, but I can't think of any linker/linkage tricks that would produce
> that behavior)
>
> (please consider my issues closed & carry on with review from anyone
> else/more domain-specific reviewers)
>
> On Fri, Feb 26, 2016 at 2:22 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Fri, Feb 26, 2016 at 1:57 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Feb 26, 2016 at 1:41 PM, Xinliang David Li <davidxl at google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Feb 26, 2016 at 1:39 PM, Xinliang David Li <davidxl at google.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Feb 26, 2016 at 1:36 PM, David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 26, 2016 at 1:33 PM, Xinliang David Li <
>>>>>> davidxl at google.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Feb 26, 2016 at 1:27 PM, David Blaikie <dblaikie at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Feb 26, 2016 at 1:18 PM, Xinliang David Li <
>>>>>>>> davidxl at google.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Feb 26, 2016 at 1:07 PM, David Blaikie <dblaikie at gmail.com
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 26, 2016 at 12:32 PM, David Li via llvm-commits <
>>>>>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> davidxl created this revision.
>>>>>>>>>>> davidxl added a reviewer: vsk.
>>>>>>>>>>> davidxl added a subscriber: llvm-commits.
>>>>>>>>>>>
>>>>>>>>>>> The per function data and counter variables for available
>>>>>>>>>>> externally functions are created with linkonce linkage (to prevent compiler
>>>>>>>>>>> from dropping counter vars). However on ELF based systems, linkonce symbols
>>>>>>>>>>> are created as weak symbols and the duplicated entries are not removed by
>>>>>>>>>>> the linker.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> That ^ seems surprising/strange/confusing/problematic - any idea
>>>>>>>>>> what's going on there? Is that a bug in the compiler? linker? specification
>>>>>>>>>> of some kind?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I am not sure what you questions is. Duplicate definitions of weak
>>>>>>>>> symbols are allowed so linker won't discard the duplicate copies. The
>>>>>>>>> symbol reference will be resolved the strong definition by the linker.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Generally, I'm trying to understand why the counter is handled
>>>>>>>> differently from the function itself.
>>>>>>>>
>>>>>>>> The comment mentions "To avoid link errors, profile counters for
>>>>>>>> function with available_externally linkage needs to be changed to
>>>>>>>> linkonce linkage. "
>>>>>>>> Why is that? Can we not rely on the counter to be emitted where the
>>>>>>>> function is emitted? (where the external definition of the function is)
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> This is a good question. This is a design choice made long ago. For
>>>>>>> function with regular external linkage, the profile counter's linkage is
>>>>>>> set to 'internal', possibly to reduce the size overhead (by eliminating
>>>>>>> symbol table entries). Because of this, the available_external function's
>>>>>>> profile counter will no longer have a backup definition to be resolved to
>>>>>>> so their counter's linkage gets changed to linkonce.
>>>>>>>
>>>>>>
>>>>>> It seems like the compiler's in control of available_externally
>>>>>> definitions, would it be possible that it would be in control of not
>>>>>> internalizing counters it knows it may use available_externally elsewhere?
>>>>>> (ie: when emitting the external definition of a thing that the compiler may
>>>>>> emit available_externally definitions for, do not internalize the counter?)
>>>>>>
>>>>>
>>>>> I am not aware of a way to check this condition -- if there is one I
>>>>> would certainly like to use it :)
>>>>>
>>>>
>>> I believe it's knowable - there may not be a single function that works
>>> as a perfect predicate for this now, but all I'm saying is: we/the compiler
>>> knows which entities we emit as available_externally, so we can know this
>>> when emitting the external definition too.
>>>
>>
>> I guess FE needs to pass that down to IR in some way.
>>
>>
>>>
>>>>
>>>> Actually there is another complicacy here -- user may decide not to
>>>> instrument the module with the strong definition -- thus this can not
>>>> really be depended upon.
>>>>
>>>
>>> Do we support this, generally? (what do we deal with inline functions
>>> and their associated counter?)
>>>
>>
>> Yes, we do support this. We have customers with legit reason that some
>> modules can not be instrumented. What is lost is only the counter values
>> for the out of line copy.
>>
>> A more general case is actually that the available_externally function is
>> strongly defined in a standard library (libc++.so) for instance. The
>> example I mentioned in the patch is such a case.
>>
>> David
>>
>>>
>>>
>>>>
>>>> David
>>>>
>>>>
>>>>>
>>>>> David
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> For bad consequences it causes, see comments in the code. One
>>>>>>>>>>> example, the profile counts for _ZNKSs7_M_dataEv method is duplicated 655
>>>>>>>>>>> times in raw profile data of clang. In the merged indexed profile, the
>>>>>>>>>>> counter values are magnified 655x -- totally dwarfed other functions --
>>>>>>>>>>> this also distorted profile summary a lot leading to not useful profile
>>>>>>>>>>> data.
>>>>>>>>>>>
>>>>>>>>>>> http://reviews.llvm.org/D17654
>>>>>>>>>>>
>>>>>>>>>>> Files:
>>>>>>>>>>> lib/Transforms/Instrumentation/InstrProfiling.cpp
>>>>>>>>>>> test/Instrumentation/InstrProfiling/linkage.ll
>>>>>>>>>>>
>>>>>>>>>>> Index: test/Instrumentation/InstrProfiling/linkage.ll
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================
>>>>>>>>>>> --- test/Instrumentation/InstrProfiling/linkage.ll
>>>>>>>>>>> +++ test/Instrumentation/InstrProfiling/linkage.ll
>>>>>>>>>>> @@ -7,6 +7,7 @@
>>>>>>>>>>> @__profn_foo_weak = weak hidden constant [8 x i8] c"foo_weak"
>>>>>>>>>>> @"__profn_linkage.ll:foo_internal" = internal constant [23 x
>>>>>>>>>>> i8] c"linkage.ll:foo_internal"
>>>>>>>>>>> @__profn_foo_inline = linkonce_odr hidden constant [10 x i8]
>>>>>>>>>>> c"foo_inline"
>>>>>>>>>>> + at __profn_foo_extern = linkonce_odr hidden constant [10 x i8]
>>>>>>>>>>> c"foo_extern"
>>>>>>>>>>>
>>>>>>>>>>> ; COMMON: @__profc_foo = hidden global
>>>>>>>>>>> ; COMMON: @__profd_foo = hidden global
>>>>>>>>>>> @@ -36,6 +37,15 @@
>>>>>>>>>>> ret void
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> +; LINUX: @__profc_foo_extern = linkonce_odr hidden global
>>>>>>>>>>> {{.*}}section "__llvm_prf_cnts", comdat($__profv_foo_extern), align 8
>>>>>>>>>>> +; LINUX: @__profd_foo_extern = linkonce_odr hidden global
>>>>>>>>>>> {{.*}}section "__llvm_prf_data", comdat($__profv_foo_extern), align 8
>>>>>>>>>>> +; OTHER: @__profc_foo_extern = linkonce_odr hidden global
>>>>>>>>>>> +; OTHER: @__profd_foo_extern = linkonce_odr hidden global
>>>>>>>>>>> +define available_externally void @foo_extern() {
>>>>>>>>>>> + call void @llvm.instrprof.increment(i8* getelementptr
>>>>>>>>>>> inbounds ([10 x i8], [10 x i8]* @__profn_foo_extern, i32 0, i32 0), i64 0,
>>>>>>>>>>> i32 1, i32 0)
>>>>>>>>>>> + ret void
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> declare void @llvm.instrprof.increment(i8*, i64, i32, i32)
>>>>>>>>>>>
>>>>>>>>>>> ; OTHER: @__llvm_profile_runtime = external global i32
>>>>>>>>>>> Index: lib/Transforms/Instrumentation/InstrProfiling.cpp
>>>>>>>>>>>
>>>>>>>>>>> ===================================================================
>>>>>>>>>>> --- lib/Transforms/Instrumentation/InstrProfiling.cpp
>>>>>>>>>>> +++ lib/Transforms/Instrumentation/InstrProfiling.cpp
>>>>>>>>>>> @@ -286,8 +286,38 @@
>>>>>>>>>>> return F->hasAddressTaken();
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> -static inline Comdat *getOrCreateProfileComdat(Module &M,
>>>>>>>>>>> +static inline bool needsComdatForCounter(Function &F, Module
>>>>>>>>>>> &M) {
>>>>>>>>>>> +
>>>>>>>>>>> + if (F.hasComdat())
>>>>>>>>>>> + return true;
>>>>>>>>>>> +
>>>>>>>>>>> + Triple TT(M.getTargetTriple());
>>>>>>>>>>> + if (!TT.isOSBinFormatELF())
>>>>>>>>>>> + return false;
>>>>>>>>>>> +
>>>>>>>>>>> + // See createPGOFuncNameVar for more details. To avoid link
>>>>>>>>>>> errors, profile
>>>>>>>>>>> + // counters for function with available_externally linkage
>>>>>>>>>>> needs to be changed
>>>>>>>>>>> + // to linkonce linkage. On ELF based systems, this leads to
>>>>>>>>>>> weak symbols to be
>>>>>>>>>>> + // created. Without using comdat, duplicate entries won't be
>>>>>>>>>>> removed by the
>>>>>>>>>>> + // linker leading to increased data segement size and raw
>>>>>>>>>>> profile size. Even
>>>>>>>>>>> + // worse, since the referenced counter from profile
>>>>>>>>>>> per-function data object
>>>>>>>>>>> + // will be resolved to the common strong definition, the
>>>>>>>>>>> profile counts for
>>>>>>>>>>> + // available_externally functions will end up being
>>>>>>>>>>> duplicated in raw profile
>>>>>>>>>>> + // data. This can result in distorted profile as the counts
>>>>>>>>>>> of those dups
>>>>>>>>>>> + // will be accumulated by the profile merger.
>>>>>>>>>>> + GlobalValue::LinkageTypes Linkage = F.getLinkage();
>>>>>>>>>>> + if (Linkage != GlobalValue::ExternalWeakLinkage &&
>>>>>>>>>>> + Linkage != GlobalValue::AvailableExternallyLinkage)
>>>>>>>>>>> + return false;
>>>>>>>>>>> +
>>>>>>>>>>> + return true;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static inline Comdat *getOrCreateProfileComdat(Module &M,
>>>>>>>>>>> Function &F,
>>>>>>>>>>>
>>>>>>>>>>> InstrProfIncrementInst *Inc) {
>>>>>>>>>>> + if (!needsComdatForCounter(F, M))
>>>>>>>>>>> + return nullptr;
>>>>>>>>>>> +
>>>>>>>>>>> // COFF format requires a COMDAT section to have a key symbol
>>>>>>>>>>> with the same
>>>>>>>>>>> // name. The linker targeting COFF also requires that the
>>>>>>>>>>> COMDAT
>>>>>>>>>>> // a section is associated to must precede the associating
>>>>>>>>>>> section. For this
>>>>>>>>>>> @@ -315,8 +345,7 @@
>>>>>>>>>>> // linking.
>>>>>>>>>>> Function *Fn = Inc->getParent()->getParent();
>>>>>>>>>>> Comdat *ProfileVarsComdat = nullptr;
>>>>>>>>>>> - if (Fn->hasComdat())
>>>>>>>>>>> - ProfileVarsComdat = getOrCreateProfileComdat(*M, Inc);
>>>>>>>>>>> + ProfileVarsComdat = getOrCreateProfileComdat(*M, *Fn, Inc);
>>>>>>>>>>>
>>>>>>>>>>> uint64_t NumCounters = Inc->getNumCounters()->getZExtValue();
>>>>>>>>>>> LLVMContext &Ctx = M->getContext();
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> llvm-commits mailing list
>>>>>>>>>>> llvm-commits at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160226/e0feff8b/attachment.html>
More information about the llvm-commits
mailing list