[PATCH] D20268: [wip] Resolution-based LTO API.

Sat May 14 18:20:05 PDT 2016

pcc added inline comments.

================
Comment at: include/llvm/LTO/LTO.h:26
@@ +25,3 @@
+  unsigned DefinitionInLinkageUnit : 1;
+
+  /// The definition of this symbol is visible outside of the LTO unit.
----------------
joker.eph wrote:
> Thinking about ThinLTO, how do you avoid GOT/PLT accesses for symbols defined in other modules? These attributes seems only relevant for access within the same module, right?
I was under the impression that whatever attribute we decide to use in D20217 could also be applied to declarations.

> Also DefinitionInLinkageUnit would be better named FinalDefinitionInLinkageUnit or something that carry the fact that it is not preemptable at runtime.

Done.

================
Comment at: include/llvm/LTO/LTO.h:49
@@ +48,3 @@
+  }
+
+  class symbol_iterator {
----------------
joker.eph wrote:
> What is specific to LTO in "iterating over the symbol in an object file"?
> I'd like to avoid a large bloated interface here.
It's more about filtering out symbols that LTO clients don't care about, namely local symbols or format-specific symbols such as llvm.used. Because LTO clients (and LTO itself) will need to be able to enumerate the symbol table in multiple places and get the same results I wanted this to be foolproof. I looked at the other non-LTO users of Object's symbols() [1,2] and it looks like they all want something a little different:
- ar only wants global defined symbols.
- nm, objdump etc. want every symbol
- lld only uses this API for LTO
So the actual filter used here does seem a little LTO specific.

[1] http://llvm-cs.pcc.me.uk/include/llvm/Object/ObjectFile.h/rsymbols
[2] http://llvm-cs.pcc.me.uk/include/llvm/Object/SymbolicFile.h/rsymbols

================
Comment at: include/llvm/LTO/LTO.h:89
@@ +88,3 @@
+           DiagnosticHandlerFunction DiagHandler);
+
+  /// The following callbacks deal with tasks, which normally represent the
----------------
joker.eph wrote:
> The point is about splitting the API in multiple abstractions instead of a big blob here.
> I was planning to have an LTOModule that expose the possibility for the linker to iterate over symbols, keep a reference to a symbols, and update the symbol resolution using the symbol reference directly.
> The linker can mark the symbol resolution the first time it sees it in many cases.
> 
> 
> I was planning to have an LTOModule that expose the possibility for the linker to iterate over symbols, keep a reference to a symbols, and update the symbol resolution using the symbol reference directly.

I considered going that way, but the problem is that that isn't necessarily compatible with how every linker works. In the gold plugin for example, because of how memory is managed in gold, we have to load an IRObjectFile to get the symbol list, throw it away and load it again later to do symbol resolution and linking. Trying to keep a persistent LTOModule seems like more trouble than it's worth in that case.

We might want some lighter-weight interface that could just store the resolutions, but I don't see how that would be substantially different from just `std::vector<SymbolResolution>`.

================
Comment at: include/llvm/LTO/LTO.h:96
@@ +95,3 @@
+  /// entire pipeline:
+  /// - A task may be split into multiple tasks. For example, parallel code
+  ///   generation will be represented as a single task during optimization,
----------------
joker.eph wrote:
> I see, however instead of a callback that callback in the callback .... model, I'd rather split it so that:
> 
> - linker calls run()
> - the (Thin)LTO processing call something like `ObjectFile allocateNewObjectFile()`
> - the //ObjectFile// offers a `getStream()` interface, as well as a `finalize()` or similar to signal completeness.
I found a way to avoid the callback so that the code in the linker can normally just return a stream if it doesn't need anything to happen after code generation. PTAL.

================
Comment at: include/llvm/LTO/LTO.h:123
@@ +122,3 @@
+  ///
+  /// File callbacks must be thread safe.
+  typedef std::function<void(size_t Task, StringRef S)> AddFileFn;
----------------
joker.eph wrote:
> Can you elaborate what you mean? I don't really what you mean, so I don't get the need for task numbering
See what the gold plugin does for example. I create a vector of object file names of size `getMaxTasks()`, then call `run()`. Each callback generates a native object and stores the file name in an element of a vector indexed by the task ID. One can more easily conclude that that part of the code is thread safe because each task is operating on a different element of the vector. Once `run()` returns, I iterate over the vector to add all native objects to the link.

If the linker could expect an unbounded number of object files, I would need to arrange for the object file list to grow while it may be accessed concurrently. That would be tricky to get right, and something equivalent would probably need to be done in every linker. Also, if I did not have task numbers, I could not ensure determinism in the final output because the callbacks for the individual tasks may be called in any order.

================
Comment at: include/llvm/LTO/LTO.h:124
@@ +123,3 @@
+  /// File callbacks must be thread safe.
+  typedef std::function<void(size_t Task, StringRef S)> AddFileFn;
+
----------------
joker.eph wrote:
> I forgot to ask: why the "File" version? Why not passing the file as a "stream"?
To implement the file callback in terms of the stream callback, LTO would need to copy the contents of the file to the stream. That wouldn't allow the linker to mmap the original file for example.

================
Comment at: include/llvm/LTO/LTO.h:138
@@ +137,3 @@
+
+  /// Runs the LTO pipeline. This function calls the supplied AddStream and
+  /// AddFile functions to add native object files to the link.
----------------
joker.eph wrote:
> These should all be moved to a different layer somehow.
I will have to see if I can find a good way to layer. LTO and ThinLTO will need to interact to some degree (e.g. if a program contains both regular LTO and ThinLTO modules, a use in a ThinLTO module would force a de-internalization in regular LTO) and I'm not sure if that can be done without a confusing set of interactions between layers.

http://reviews.llvm.org/D20268