[llvm-dev] [RFC] (Thin)LTO with Linker Scripts

Tobias von Koch via llvm-dev llvm-dev at lists.llvm.org
Tue May 15 08:51:19 PDT 2018


Hi Peter,

On Mon, May 14, 2018 at 8:14 AM Peter Smith via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> My understanding from the RFC is:
> - All global objects in the bitcode file will be assigned a section name.
>

... which is equal to the section name that they would have been emitted to
if this was a regular compilation. In addition to allowing the linker to
read section names from the bitcode, this also helps support mixing
-ffunction-sections and -fno-function-sections and similar options (forgot
to mention that in the RFC).

> - A linker will communicate the output section of all global objects.


Correct. (Global objects in the LLVM sense, so that includes objects with
local linkage).


> - Certain transformations won't be performed if the output section is
> different.
>

Correct. Plus, others can be enabled if they're safe to apply when we know
things are going to the same output section.


> The common use cases that I can see that might not fit perfectly into
> that model:
> - Code that is in different OutputSections but it will be logically
> correct and in many cases desirable to perform transformations on as
> if they were in the same output section.


Right. The output section that the linker communicates for a symbol doesn't
need to correspond to a "physical" output section. If, say, the
linker knows (or the user somehow tells it) that two output sections should
be considered equivalent, the linker can communicate the same output
section identifier for symbols in either of the two physical output
sections. This is perfectly safe since the output section info is only ever
used to enable/inhibit optimizations, not for actual symbol emission by LTO.
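
A minimal sketch of how a linker might collapse equivalent physical output
sections into one identifier before handing it to LTO. Everything here
(DomainMap, markEquivalent, domainOf, the section names) is hypothetical and
only illustrates the idea; it is not an existing LLVM or linker API.

```cpp
#include <map>
#include <string>

// Hypothetical linker-side helper: maps a physical output section name to the
// identifier that would be communicated to LTO.
class DomainMap {
  // Section -> section it was merged into. Representatives have no entry.
  std::map<std::string, std::string> Parent;

  // Follow the merge chain until a representative is reached.
  std::string find(std::string Sec) const {
    for (auto It = Parent.find(Sec); It != Parent.end(); It = Parent.find(Sec))
      Sec = It->second;
    return Sec;
  }

public:
  // Record that A and B may be treated as one domain by the optimizer.
  void markEquivalent(const std::string &A, const std::string &B) {
    std::string RA = find(A), RB = find(B);
    if (RA != RB)
      Parent[RB] = RA;
  }

  // Identifier reported to LTO for a symbol placed in Sec.
  std::string domainOf(const std::string &Sec) const { return find(Sec); }
};
```

Symbols in sections that were never marked equivalent keep their own
identifier, so the default behaviour stays conservative.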

> - Output section placement rules that are not based on names, for
> example Arm's linker can assign sections to an output section until
> the output section size limit is reached, then a different output
> section is used. I admit that this may be more of a problem for
> linkers that have a different linker script model.
>

That should actually just work in the existing model. Before LTO runs, we
don't know the size of symbols anyway, so the linker will just communicate
the original output section for all of them and we apply optimizations
across them as if they all fitted in the same section. After LTO, some may
end up in the 'overflow' section but LTO doesn't need to know about that
since it wouldn't have been correct for the user to make any assumptions
about what ends up in the original section vs overflow in the first place.

> I think both cases are illustrative of a use case where the precise
> output section does not matter, but there is a vaguer goal of placing
> a subset of the input sections in a subset of the output sections.
> From what I can tell there isn't a way for the code generator to tell
> the difference between code that is placed in different output
> sections and it is not correct or beneficial to optimize and code that
> is placed in different output sections and it is correct and
> beneficial to optimize together.
>

Perhaps we should rename the "output section" that is communicated to LTO
to something less specific to make it clear that it can be used for exactly
this purpose. Optimization domain? Partition?

> I think that this kind of use case could be supported by doing something
> like:
> - Linker informs code generator the output sections that must not use
> any information from another module and may not contribute any
> information to another module. For example an output section that is
> representing an overlay.
>

It's not so much about other modules (files) - you could have multiple
files contributing input sections to the same overlay, for instance, and
you would want to optimize across them. But you wouldn't want to
de-duplicate a constant from another overlay. I think the
OutputSectionID-as-optimization-domain idea captures this use case, no?
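
To make the overlay case concrete, here is a rough sketch of constant
de-duplication keyed on (domain, value) rather than on value alone. The
Constant struct, mergeConstants, and the numeric domain IDs are made up for
illustration and do not correspond to any existing LLVM pass or data
structure.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a constant as seen by the optimizer.
struct Constant {
  std::string Name;   // symbol name, e.g. qualified by input file
  std::string Value;  // constant contents
  unsigned Domain;    // identifier communicated by the linker
};

// For each constant, return the name of the constant it is merged into.
// Two constants are merged only if both their value and their domain match,
// so files contributing to the same overlay share one copy, while identical
// constants in different overlays stay separate.
std::map<std::string, std::string>
mergeConstants(const std::vector<Constant> &Cs) {
  std::map<std::pair<unsigned, std::string>, std::string> Canonical;
  std::map<std::string, std::string> Result;
  for (const Constant &C : Cs) {
    // emplace keeps the first constant seen for this (domain, value) key.
    auto Ins = Canonical.emplace(std::make_pair(C.Domain, C.Value), C.Name);
    Result[C.Name] = Ins.first->second;
  }
  return Result;
}
```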

> - Linker can omit the output section information for sections that the
> user doesn't care where they go, and let the linker decide based on
> some size constraint later.


That's an interesting idea to allow a 'don't care' output section ID; we
would have to be pretty careful in defining what that means on a
per-optimization basis. That is, am I allowed to inline a function with a
defined output section into a function without one (probably no)? Vice
versa (probably yes)?
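
As a sketch of what such a per-optimization definition could look like for
inlining, assuming the tentative answers above ("probably no" / "probably
yes") were adopted. The names DontCare and mayInline are hypothetical, and
treating two 'don't care' functions as inlinable is an additional assumption.

```cpp
// Hypothetical sentinel meaning "the linker gave no output section ID".
const unsigned DontCare = ~0u;

// Returns true if inlining Callee's body into Caller would be allowed under
// the tentative rules discussed above.
bool mayInline(unsigned CalleeDomain, unsigned CallerDomain) {
  // A callee without a placement requirement can be copied anywhere.
  if (CalleeDomain == DontCare)
    return true;
  // A callee with a defined section must not leak into code whose final
  // placement is unknown.
  if (CallerDomain == DontCare)
    return false;
  // Both are placed: only inline within the same domain.
  return CalleeDomain == CallerDomain;
}
```

Other transformations (constant merging, for example) would need their own
rule, since the safe direction is not necessarily the same.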

> I think that these are mostly details rather than fundamental problems
> though.
>

Thank you very much for your comments!

Tobias