[PATCH] D36351: [lld][ELF] Add profile guided section layout

Michael Spencer via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 26 19:50:49 PDT 2017


On Tue, Oct 24, 2017 at 9:30 PM, Rui Ueyama <ruiu at google.com> wrote:

> Unfortunately, even with --full-shutdown, the result was the same.
>
>
I tested this out and it worked for me with --full-shutdown while linking a
trivial object file. One thing I forgot to mention is that you need to also
compile the compiler-rt profile runtime at the same version as clang and
llvm. It's probably falling back to an incompatible system version.

- Michael Spencer


> On Tue, Oct 24, 2017 at 9:25 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> On Tue, Oct 24, 2017 at 9:23 PM, Michael Spencer <bigcheesegs at gmail.com>
>> wrote:
>>
>>> On Tue, Oct 24, 2017 at 8:58 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> Sorry, as you pointed out, I made a mistake when I applied the patches.
>>>> Now I succeeded to build it. But the output file of llvm-profdata seems too
>>>> small. Is this correct?
>>>>
>>>> This is what I did.
>>>>
>>>> $ CC=`which clang` CXX=`which clang++` /ssd/cmake/bin/cmake -GNinja
>>>> -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS='clang;lld'
>>>> -DCMAKE_C{,XX}_FLAGS=-fprofile-instr-generate ../llvm-project/llvm/
>>>>
>>>> $ ninja clang lld
>>>>
>>>> $ bin/ld.lld <options-to-link-clang>
>>>>
>>>> $ bin/llvm-profdata merge default.profraw -o default.profdata
>>>>
>>>> $ ls -l default.*
>>>> -rw-r----- 1 ruiu eng     560 Oct 24 20:55 default.profdata
>>>> -rw-r----- 1 ruiu eng 2151240 Oct 24 20:55 default.profraw
>>>>
>>>> $ bin/llvm-profdata show default.profraw
>>>> error: default.profraw: Empty raw profile file
>>>>
>>>
>>> It could be that lld is using quick exit and not properly flushing the
>>> profile data to disk. I'll test out this configuration.
>>>
>>
>> Ah, it's likely. I'll try again with `--full-shutdown` option.
>>
>>
>>> - Michael Spencer
>>>
>>>
>>>>
>>>> On Mon, Oct 23, 2017 at 7:52 PM, Michael Spencer <bigcheesegs at gmail.com
>>>> > wrote:
>>>>
>>>>> On Mon, Oct 23, 2017 at 6:53 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>>>
>>>>>> It didn't pass cmake because lld/ELF/CMakeLists.txt lacked
>>>>>> "CallGraphSort.cpp", so I added that to the CMakeLists.txt.
>>>>>>
>>>>>> It still didn't compile because `Config::CallGraphProfile` doesn't
>>>>>> exist, so I added that boolean variable.
>>>>>>
>>>>>> /ssd/llvm-project/lld/ELF/CallGraphSort.cpp:274:29: error: no member
>>>>>> named 'CallGraphProfile' in 'lld::elf::Configuration'
>>>>>>   CallGraphSort CGS(Config->CallGraphProfile);
>>>>>>                       ~~~~~~  ^
>>>>>>
>>>>>> Now, I'm seeing the following errors. How can I fix them?
>>>>>>
>>>>>> /ssd/llvm-project/lld/ELF/CallGraphSort.cpp:274:17: error: no
>>>>>> matching constructor for initialization of '(anonymous
>>>>>> namespace)::CallGraphSort'
>>>>>>   CallGraphSort CGS(Config->CallGraphProfile);
>>>>>>                 ^   ~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>> /ssd/llvm-project/lld/ELF/CallGraphSort.cpp:43:7: note: candidate
>>>>>> constructor (the implicit copy constructor) not viable: no known conversion
>>>>>> from 'bool' to 'const (anonymous namespace)::CallGraphSort' for 1st argument
>>>>>> class CallGraphSort {
>>>>>>       ^
>>>>>> /ssd/llvm-project/lld/ELF/CallGraphSort.cpp:43:7: note: candidate
>>>>>> constructor (the implicit move constructor) not viable: no known conversion
>>>>>> from 'bool' to '(anonymous namespace)::CallGraphSort' for 1st argument
>>>>>> class CallGraphSort {
>>>>>>       ^
>>>>>> /ssd/llvm-project/lld/ELF/CallGraphSort.cpp:115:16: note: candidate
>>>>>> constructor not viable: no known conversion from 'bool' to
>>>>>> 'DenseMap<std::pair<const SymbolBody *, const SymbolBody *>, uint64_t> &'
>>>>>> (aka 'DenseMap<pair<const lld::elf::SymbolBody *, const
>>>>>> lld::elf::SymbolBody *>, unsigned long> &') for 1st argument
>>>>>> CallGraphSort::CallGraphSort(
>>>>>>                ^
>>>>>>
>>>>>>
>>>>> Are you sure you applied the patch correctly? I just checked the diff
>>>>> and it has all that.
>>>>>
>>>>> diff --git a/ELF/CMakeLists.txt b/ELF/CMakeLists.txt
>>>>> index 205702975..3b3c388d2 100644
>>>>> --- a/ELF/CMakeLists.txt
>>>>> +++ b/ELF/CMakeLists.txt
>>>>> @@ -18,6 +18,7 @@ add_lld_library(lldELF
>>>>>    Arch/SPARCV9.cpp
>>>>>    Arch/X86.cpp
>>>>>    Arch/X86_64.cpp
>>>>> +  CallGraphSort.cpp
>>>>>    Driver.cpp
>>>>>    DriverUtils.cpp
>>>>>    EhFrame.cpp
>>>>>
>>>>> It also has the change that adds CallGraphProfile to Config.h.
>>>>>
>>>>> - Michael Spencer
>>>>>
>>>>>
>>>>>> On Mon, Oct 23, 2017 at 5:02 PM, Michael Spencer <
>>>>>> bigcheesegs at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> - Michael Spencer
>>>>>>>
>>>>>>> On Tue, Oct 3, 2017 at 6:43 PM, Rui Ueyama via Phabricator <
>>>>>>> reviews at reviews.llvm.org> wrote:
>>>>>>>
>>>>>>>> ruiu added a comment.
>>>>>>>>
>>>>>>>> Could you send me a patch to produce a call graph file so that I
>>>>>>>> can try this patch on my machine?
>>>>>>>>
>>>>>>>
>>>>>>> Sorry for the delay. I've attached the llvm and lld patches that
>>>>>>> implement a full testable version of the feature.
>>>>>>>
>>>>>>> To use:
>>>>>>>
>>>>>>> -1) Have an elf system with working clang instrumentation based
>>>>>>> profiling
>>>>>>> 0) Compile clang and lld with the supplied patch
>>>>>>> 1) Compile the code with `-fprofile-instr-generate` and link with
>>>>>>> any linker
>>>>>>> 2) Run the program on a representative sample
>>>>>>> 3) `$ llvm-profdata merge default.profraw -o default.profdata`
>>>>>>> 4) Compile the code again with `-fprofile-instr-use=default.profdata
>>>>>>> -ffunction-sections -fuse-ld=lld`
>>>>>>>
>>>>>>> The output of #4 is the program with sections ordered by profile
>>>>>>> data. You can add -Wl,-no-call-graph-profile-sort to disable
>>>>>>> sorting to measure the difference.
>>>>>>>
>>>>>>> `llvm-readobj -elf-cg-profile` will dump the cg profile section.
>>>>>>>
>>>>>>> This also works with LTO.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:99-101
>>>>>>>> +  if (To != Other.To)
>>>>>>>> +    return To < Other.To;
>>>>>>>> +  return false;
>>>>>>>> ----------------
>>>>>>>> You can just return `To < Other.To`.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:127
>>>>>>>> +  // Create the graph.
>>>>>>>> +  for (const auto &C : Profile) {
>>>>>>>> +    if (C.second == 0)
>>>>>>>> ----------------
>>>>>>>> This loop is a bit too dense. It cannot be understood without
>>>>>>>> reading each line carefully as I don't understand the whole picture. Please
>>>>>>>> insert a blank line between code blocks. Adding more comment would help.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:128
>>>>>>>> +  for (const auto &C : Profile) {
>>>>>>>> +    if (C.second == 0)
>>>>>>>> +      continue;
>>>>>>>> ----------------
>>>>>>>> Please define local variables for `C.first.first`, `C.first.second`
>>>>>>>> and `C.second` so that they are accessed through meaningful names.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:130-135
>>>>>>>> +    auto FromDR = dyn_cast_or_null<DefinedRegula
>>>>>>>> r>(Symtab->find(C.first.first));
>>>>>>>> +    auto ToDR = dyn_cast_or_null<DefinedRegula
>>>>>>>> r>(Symtab->find(C.first.second));
>>>>>>>> +    if (!FromDR || !ToDR)
>>>>>>>> +      continue;
>>>>>>>> +    auto FromSB = dyn_cast_or_null<const
>>>>>>>> InputSectionBase>(FromDR->Section);
>>>>>>>> +    auto ToSB = dyn_cast_or_null<const
>>>>>>>> InputSectionBase>(ToDR->Section);
>>>>>>>> ----------------
>>>>>>>> auto -> auto *
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:147
>>>>>>>> +      Nodes[To].IncidentEdges.push_back(EI);
>>>>>>>> +    } else
>>>>>>>> +      Edges[EI].Weight = SaturatingAdd(Edges[EI].Weight,
>>>>>>>> C.second);
>>>>>>>> ----------------
>>>>>>>> nit: add {}
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:153
>>>>>>>> +
>>>>>>>> +void CallGraphSort::contractEdge(EdgeIndex CEI) {
>>>>>>>> +  // Make a copy of the edge as the original will be marked killed
>>>>>>>> while being
>>>>>>>> ----------------
>>>>>>>> Please add a function comment as to what this function is intended
>>>>>>>> to do. I do not understand this function because I don't get a whole
>>>>>>>> picture.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:158-163
>>>>>>>> +  // Remove the self edge from From.
>>>>>>>> +  FE.erase(std::remove(FE.begin(), FE.end(), CEI));
>>>>>>>> +  std::vector<EdgeIndex> &TE = Nodes[CE.To].IncidentEdges;
>>>>>>>> +  // Update all edges incident with To to reference From instead.
>>>>>>>> Then if they
>>>>>>>> +  // aren't self edges add them to From.
>>>>>>>> +  for (EdgeIndex EI : TE) {
>>>>>>>> ----------------
>>>>>>>> Add blank lines before comments.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:165-166
>>>>>>>> +    Edge &E = Edges[EI];
>>>>>>>> +    // E.From = E.From == CE.To ? CE.From : E.From;
>>>>>>>> +    // E.To = E.To == CE.To ? CE.From : E.To;
>>>>>>>> +    if (E.From == CE.To)
>>>>>>>> ----------------
>>>>>>>> Please remove debug code.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:178-180
>>>>>>>> +  // Free memory.
>>>>>>>> +  std::vector<EdgeIndex>().swap(TE);
>>>>>>>> +
>>>>>>>> ----------------
>>>>>>>> This looks odd. Why do you need to do this? I think you can just
>>>>>>>> leave it alone.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:207-208
>>>>>>>> +
>>>>>>>> +// Group InputSections into clusters using the Call-Chain
>>>>>>>> Clustering heuristic
>>>>>>>> +// then sort the clusters by density.
>>>>>>>> +void CallGraphSort::generateClusters() {
>>>>>>>> ----------------
>>>>>>>> This might be understood for those who read the paper, but I don't
>>>>>>>> think that is enough. Please write more comment as to what are clusters,
>>>>>>>> what is density, and what is the heuristic.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/CallGraphSort.cpp:272-273
>>>>>>>> +DenseMap<const InputSectionBase *, int>
>>>>>>>> elf::computeCallGraphProfileOrder() {
>>>>>>>> +  CallGraphSort CGS(Config->CallGraphProfile);
>>>>>>>> +  return CGS.run();
>>>>>>>> +}
>>>>>>>> ----------------
>>>>>>>> You can do this in one line without defining a local variable.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: ELF/Writer.cpp:926-928
>>>>>>>> +    for (BaseCommand *Base : Script->Opt.Commands)
>>>>>>>> +      if (auto *OS = dyn_cast<OutputSection>(Base))
>>>>>>>> +        if (OS->Name == ".text") {
>>>>>>>> ----------------
>>>>>>>> I'd factor this code out as `OutputSection
>>>>>>>> *findOutputSection(StringRef Name)`.
>>>>>>>>
>>>>>>>>
>>>>>>>> https://reviews.llvm.org/D36351
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20171026/2b16b840/attachment.html>


More information about the llvm-commits mailing list