[PATCH] D78342: [lld] Add archive file support to Mach-O backend

Kellie Medlin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri May 8 16:41:05 PDT 2020


Ktwu marked 5 inline comments as done.
Ktwu added inline comments.


================
Comment at: lld/MachO/InputFiles.cpp:248-255
+std::vector<InputSection *> const ArchiveFile::getInputSections() {
+  std::vector<InputSection *> sections;
+  for (InputFile *input : resolvedInputs) {
+    auto inputSections = input->getInputSections();
+    sections.insert(sections.end(), inputSections.begin(), inputSections.end());
+  }
+  return sections;
----------------
int3 wrote:
> smeenai wrote:
> > Ktwu wrote:
> > > int3 wrote:
> > > > Kind of unfortunate that we have to return by value here, but I see that lld-ELF's `getInputSections` does something similar.
> > > > 
> > > > Also, did you mean to type this method as `std::vector<InputSection *> ArchiveFile::getInputSections() const {`? I *think* the current placement of the `const` is saying that the return value is `const` instead of saying the method is const...
> > > re: const, yup, thanks!
> > The ELF equivalent is `InputFiles::getSections`, and it returns an `ArrayRef`.
> > 
> > Returning by value wouldn't be an issue here – we're returning a local object, so return value optimization ([NRVO](https://en.cppreference.com/w/cpp/language/copy_elision)) kicks in. It's not ideal for `ObjFile::getInputSections` though, as I commented above.
> Ah, sorry, I was looking at `getInputSections()` in OutputSections.cpp, but I see it's doing something different. And good point about the NRVO, forgot about that.
> 
> Maybe we could make this a `forEachSection(callback)` instead, though I do like the `objectFiles` idea too.
What about keeping the original input sections vector within the Archive file? When loading an object file, that file's sections are cached away.


================
Comment at: lld/MachO/InputFiles.h:59
   static bool classof(const InputFile *f) { return f->kind() == ObjKind; }
+  std::vector<InputSection *> getInputSections() const { return sections; }
+
----------------
smeenai wrote:
> Nit: add `override`
> 
> I'm *really* wary of this returning by value ... that seem like a bunch of unnecessary copying. The ELF equivalent to this function is `InputFile::getSections` (which isn't overridden); it returns an `ArrayRef`, and it asserts that it's only called on object or binary files (and we don't have to worry about binary files). ELF makes that work by having a global `objectFiles` vector (instead of just an `inputFiles` vector), so when it's doing the loop over all input sections, it only loops over object files.
> 
> Whether we wanna have a specific `objectFiles` global vector as well or just stick with `inputFiles` is a broader design question, but we should at least figure out a good way to make this not return by value.
I realize now that I can continue to keep a global sections member; when fetching an ArchiveFile, however, I can immediately fetch each child object's sections and add them to the Archive. No need to worry about a new function in that case.


================
Comment at: lld/MachO/InputFiles.h:70
   static bool classof(const InputFile *f) { return f->kind() == DylibKind; }
+  std::vector<InputSection *> getInputSections() const { return sections; }
 
----------------
smeenai wrote:
> `sections` should always be empty for a DylibFile, so this is confusing. Perhaps just construct and return an empty vector here? (Although that wouldn't work with the ArrayRef comment above, but in that case a comment would be good.)
No longer relevant since I'm just keeping the old private member.


================
Comment at: lld/MachO/SymbolTable.cpp:77
+  std::tie(s, wasInserted) = insert(name);
+
+  if (wasInserted)
----------------
smeenai wrote:
> int3 wrote:
> > Ktwu wrote:
> > > MaskRay wrote:
> > > > How does an archive symbol interact with a DylibSymbol?
> > > By "interact" do you mean if there's a conflict -- a dylib and an archive both expose a symbol with the same name? I assume there shouldn't be issues if symbol names are unique...
> > > 
> > > @int3 you know more about dylibs; do you see potential issues?
> > We definitely need to account for symbol name collisions. I just commented on {D79114}, but basically I've just been feeding various inputs to ld64 and emulating its output, though we should probably figure out the details of its symbol table implementation sometime...
> ld64 uses the same archive file semantics as LLD ELF and COFF, where an archive member can be used to resolve undefined symbol references regardless of its position on the command line.
> 
> A defined symbol in an object file always wins out over a dylib symbol. An archive member that got included in the link is treated as a regular object file for this purpose. If the dylib symbol is seen first, an archive member defining the same symbol won't be pulled out (unless some other symbol from that member is referenced).
> 
> I believe that's the behavior we're implementing here. A LazySymbol won't replace a DylibSymbol, but if the member gets fetched, all its symbols will be added as Defined, which will override DylibSymbols.
> 
> Here's an example to clarify what I mean:
> 
> ```
> $ cat f1.c
> int f() { return 1; }
> 
> $ cat f2.c
> int f() { return 2; }
> 
> $ cat g.c
> int f(void);
> int g() { return f(); }
> 
> $ cat f2g.c
> int f() { return 2; }
> int g() { return f(); }
> 
> $ cat main.c
> int g(void);
> int main() { return g(); }
> 
> $ clang -c f1.c f2.c g.c f2g.c main.c
> $ ld -dylib -o libf1.dylib f1.o
> $ ar crs libf2_g.a f2.o g.o
> $ ar crs libf2g.a f2g.o
> 
> $ ld libf1.dylib libf2_g.a main.o -lSystem
> $ ./a.out; echo $?
> 1 # because libf1.dylib was first in the link line
> 
> $ ld libf2_g.a libf1.dylib main.o -lSystem
> $ ./a.out; echo $?
> 2 # because libf2_g.a was first on the link line
> 
> $ ld libf1.dylib libf2g.a main.o -lSystem
> $ ./a.out; eho $?
> 2 # libf2g.a(f2g.o) was included in the link cos of the reference to g, so the f from there was used
> ```
Neat; I'll turn this research into a test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78342/new/

https://reviews.llvm.org/D78342





More information about the llvm-commits mailing list