[cfe-dev] [llvm-dev] Your help needed: List of LLVM Open Projects 2017

Sean Silva via cfe-dev cfe-dev at lists.llvm.org
Mon Jan 16 15:41:50 PST 2017


On Mon, Jan 16, 2017 at 3:35 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> On Jan 16, 2017, at 3:24 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
>
> On Mon, Jan 16, 2017 at 2:07 PM, Mehdi Amini <mehdi.amini at apple.com>
> wrote:
>
>>
>> On Jan 16, 2017, at 1:47 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>
>>
>>
>> On Mon, Jan 16, 2017 at 1:25 PM, Davide Italiano <davide at freebsd.org>
>> wrote:
>>
>>> On Mon, Jan 16, 2017 at 12:31 PM, Sean Silva via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>> > Do we have any open projects on LLD?
>>> >
>>> > I know we usually try to avoid any big "projects" and mainly add/fix
>>> things
>>> > in response to user needs, but just wondering if somebody has any
>>> ideas.
>>> >
>>>
>>> I'm not particularly active in lld anymore, but the last big item I'd
>>> like to see implemented is Pettis-Hansen layout:
>>> http://perso.ensta-paristech.fr/~bmonsuez/Cours/B6-4/Articles/papers15.pdf
>>> (mainly because it improves the performance of the final executable).
>>> GCC/gold have an implementation of the algorithm that can be used as a
>>> base; a rough sketch of the idea is below, and I'll expand if anybody is
>>> interested.
>>> Side note: I'd like to propose a couple of LLVM projects as well; I'll
>>> sit down later today and write them up.
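>>> (To give a flavor of the sketch I mentioned: the core of Pettis-Hansen is a
>>> greedy merge of call-graph chains by edge weight. The C++ below is a
>>> simplified illustration with made-up names, not the GCC/gold
>>> implementation.)
>>> ```
>>> #include <algorithm>
>>> #include <cstdint>
>>> #include <map>
>>> #include <string>
>>> #include <vector>
>>>
>>> // Weighted call-graph edge: caller -> callee with an execution count.
>>> struct Edge { std::string From, To; uint64_t Weight; };
>>>
>>> // Greedy chain merging in the spirit of Pettis-Hansen: process edges from
>>> // hottest to coldest and splice together the two chains they connect, so
>>> // functions that call each other often end up adjacent in the layout.
>>> std::vector<std::string> pettisHansenLikeOrder(std::vector<Edge> Edges) {
>>>   std::map<std::string, std::string> Rep;                 // function -> chain rep
>>>   std::map<std::string, std::vector<std::string>> Chain;  // rep -> layout
>>>   auto findRep = [&](std::string F) {
>>>     while (Rep[F] != F)
>>>       F = Rep[F];
>>>     return F;
>>>   };
>>>   for (const Edge &E : Edges)
>>>     for (const std::string &F : {E.From, E.To})
>>>       if (!Rep.count(F)) {
>>>         Rep[F] = F;
>>>         Chain[F] = {F};
>>>       }
>>>   std::sort(Edges.begin(), Edges.end(),
>>>             [](const Edge &A, const Edge &B) { return A.Weight > B.Weight; });
>>>   for (const Edge &E : Edges) {
>>>     std::string A = findRep(E.From), B = findRep(E.To);
>>>     if (A == B)
>>>       continue;
>>>     // Append B's chain to A's. (Real P-H also considers flipping chains so
>>>     // the endpoints of the merged edge stay as close together as possible.)
>>>     Chain[A].insert(Chain[A].end(), Chain[B].begin(), Chain[B].end());
>>>     Chain.erase(B);
>>>     Rep[B] = A;
>>>   }
>>>   // Concatenate whatever chains remain (order among chains is arbitrary here).
>>>   std::vector<std::string> Order;
>>>   for (const auto &KV : Chain)
>>>     Order.insert(Order.end(), KV.second.begin(), KV.second.end());
>>>   return Order;
>>> }
>>> ```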
>>>
>>
>>
>> I’m not sure; can you confirm that such a layout optimization on ELF
>> requires -ffunction-sections?
>>
>
> In order for a standard ELF linker to safely be able to reorder sections
> at function granularity, -ffunction-sections would be required. This isn't
> a problem during LTO since the code generation is set up by the linker :)
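> (To make that concrete, an illustration of what -ffunction-sections buys
> the linker; the section names below are the usual .text.<name> convention:)
> ```
> /* example.c, compiled with: clang -O2 -ffunction-sections -c example.c */
> int foo(void) { return 1; } /* goes into its own section, .text.foo */
> int bar(void) { return 2; } /* goes into .text.bar */
> /* Without -ffunction-sections both functions share the single .text
>    section, so a standard ELF linker has no safe unit to reorder. */
> ```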
>
>
>>
>> Also, for clang on OSX the best layout we could get is to order functions
>> in the order in which they get executed at runtime.
>>
>
> What the optimal layout may be for given apps is a bit of a separate
> question. Right now we're mostly talking about how to plumb everything
> together so that we can do the reordering of the final executable.
>
>
> Yes, I was raising this exactly with the idea of “we may want to try
> different algorithms based on different kinds of data”.
>
>
> In fact, standard ELF linking semantics generally require input sections
> to be concatenated in command line order (this is e.g. how
> .init_array/.ctors build up their arrays of pointers to initializers; a
> crt*.o file at the beginning/end has a sentinel value and so the order
> matters). So the linker will generally need blessing from the compiler to
> do most sorts of reorderings as far as I'm aware.
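> (A small, hedged illustration of that ordering dependence: constructor order
> across TUs isn't specified by the language, but in practice it follows the
> order in which the .init_array fragments are concatenated, i.e. link order.)
> ```
> /* a.c */
> #include <stdio.h>
> __attribute__((constructor)) static void ctor_a(void) { puts("a"); }
>
> /* b.c */
> #include <stdio.h>
> __attribute__((constructor)) static void ctor_b(void) { puts("b"); }
>
> /* Each TU contributes one pointer to .init_array; linking "a.o b.o" versus
>    "b.o a.o" changes the concatenation order, and with it the startup order.
>    A linker that silently reshuffled input sections could break code that
>    depends on this kind of ordering. */
> ```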
>
> Other signals besides profile info, such as a startup trace, might be
> useful too, and we should make sure we can plug that into the design.
> My understanding of the clang on OSX case is based on a comparison of the
> `form_by_*` functions in clang/utils/perf-training/perf-helper.py, which
> offer a relatively simple set of algorithms, so I think the jury is still
> out on the best approach. (That script also uses a data collection method
> that is not part of LLVM's usual instrumentation or sampling workflows for
> PGO, so we may not be able to provide the same signals out of the box as
> part of our standard offering in the compiler.)
>
>
> Yes, I was thinking that some XRay-based instrumentation could be used to
> provide the same data.
>

I hadn't thought of using XRay for this! Good idea! (I haven't been
following XRay very closely; I should look at it more...)
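
Off the top of my head, something along these lines might be a starting
point (a sketch going off my reading of compiler-rt's xray_interface.h, so
treat the exact interface names as approximate):

```
// Rough idea only: record the order in which instrumented functions are
// first entered, to be dumped as an ordering hint after running the startup
// workload of interest. Assumes a binary built with -fxray-instrument;
// mapping function ids back to symbol names is left out.
#include <xray/xray_interface.h>

#include <cstdint>
#include <cstdio>
#include <mutex>
#include <set>
#include <vector>

static std::mutex OrderMutex;
static std::set<int32_t> Seen;
static std::vector<int32_t> FirstEntryOrder;

static void recordEntry(int32_t FuncId, XRayEntryType Type) {
  if (Type != XRayEntryType::ENTRY)
    return;
  std::lock_guard<std::mutex> Lock(OrderMutex);
  if (Seen.insert(FuncId).second)
    FirstEntryOrder.push_back(FuncId);
}

int main() {
  __xray_set_handler(recordEntry);
  __xray_patch();    // enable the instrumentation sleds
  // ... exercise the startup path we care about ...
  __xray_unpatch();
  for (int32_t Id : FirstEntryOrder)
    std::printf("fn %d\n", Id);  // map ids back to symbols offline
}
```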

-- Sean Silva


>
> I think that once we have this ordering capability integrated more deeply
> into the compiler, we'll be able to evaluate more complicated algorithms
> like Pettis-Hansen, have access to signals like global profile info, do
> interesting call graph analyses, etc. to find interesting approaches.
>
>
>>
>> For FullLTO it is conceptually pretty easy to get the profile data we need
>> for this, but I'm not sure about the ThinLTO case.
>>
>> Teresa, Mehdi,
>>
>> Are there any plans (or things already working!) for getting profile data
>> from ThinLTO in a format that the linker can use for code layout? I assume
>> that profile data is being used already to guide importing, so it may just
>> be a matter of siphoning that off.
>>
>>
>> I’m not sure what kind of “profile information” is needed, and what makes
>> it easier for MonolithicLTO compared to ThinLTO?
>>
>
> For MonolithicLTO I had in mind that a simple implementation would be:
> ```
> // Hypothetical sketch: collect a function ordering from a dedicated module pass.
> std::vector<std::string> Ordering;
> auto Pass = llvm::make_unique<LayoutModulePass>(&Ordering);
> addPassToLTOPipeline(std::move(Pass));
> ```
>
> The module pass would just query the profile data directly on the IR data
> structures and get the order out. This would require very little
> "plumbing".
>
>
>>
>> Or maybe that layout code should be inside LLVM; maybe part of the
>> general LTO interface? It looks like the current gcc plugin calls back into
>> gcc for the actual layout algorithm itself (the function
>> find_pettis_hansen_function_layout) rather than having the reordering logic
>> live in the linker:
>> https://android.googlesource.com/toolchain/gcc/+/3f73d6ef90458b45bbbb33ef4c2b174d4662a22d/gcc-4.6/function_reordering_plugin/function_reordering_plugin.c
>>
>>
>> I was thinking about this: could this be done by reorganizing the module
>> itself for LTO?
>>
>
> For MonolithicLTO that's another simple approach.
>
>
>>
>> That wouldn’t help non-LTO and ThinLTO though.
>>
>
> I think we should ideally aim for something that works uniformly for
> Monolithic and Thin. For example, GCC emits special sections containing the
> profile data and the linker just reads those sections; something analogous
> in LLVM would just happen in the backend and be common to Monolithic and
> Thin. If ThinLTO already has profile summaries in some nice form though, it
> may be possible to bypass this.
>
> Another advantage of using special sections in the output like GCC does is
> that you don't actually need LTO at all to get the function reordering. The
> profile data passed to the compiler during per-TU compilation can be
> lowered into the same kind of annotations (though LTO and function
> ordering are likely to go hand in hand most often for peak-performance
> builds).
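>
> (To illustrate the kind of thing I mean; the section name and record format
> below are made up, and this is not what GCC actually emits:)
> ```
> /* Each TU drops a small record into a dedicated section.  The linker sees
>    the concatenation of all of them and can use it to choose a function
>    order, with or without LTO.  A real implementation would presumably use
>    a more compact, non-allocated encoding emitted by the backend. */
> struct FuncOrderHint {
>   const char *Name;               /* function symbol name */
>   unsigned long long EntryCount;  /* profile-derived hotness */
> };
> __attribute__((section(".llvm.fn_order_hints"), used))
> static const struct FuncOrderHint MainHint = { "main", 12345 };
> ```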
>
>
> Yes, I agree with all of this :)
> That makes for an interesting design trade-off!
>
>> Mehdi
>
>