[PATCH] D55864: [elfabi] Write program headers, .dynamic, .dynstr, and .shstrtab

Tue Apr 30 19:16:33 PDT 2019

jakehehrlich updated this revision to Diff 197498.
jakehehrlich added a reviewer: plotfi.
jakehehrlich added a comment.
Herald added subscribers: MaskRay, arichardson, emaste.
Herald added a reviewer: espindola.

OK, so this is the rough idea I have for doing this. I need to add program headers, which turns out to require refactoring this a bit. The general idea takes some explanation, and there are multiple ways to think about it. The goal is for everything to "work itself out" automagically. I'll need to add program headers because many bits of functionality are not easily testable here without them.

You'll note that I put the NOBITS sections before the other sections. This doesn't jibe well with using a single program header to cover them all, and since space is something we'd like to constrain here, we should support it. Unfortunately I did that as an optimization, as a means of avoiding the dynamic symbol table needing to know the addresses of the NOBITS sections before starting. This can be overcome with an additional loop, however. I think an additional loop would be needed even if you hand-rolled this, and by adding this loop we can probably avoid maintaining an intermediate vector for the dynamic symbol table. Instead we can compute the overall size first, allocate the FileOutputBuffer, and then write directly into it. I'm not sure the complexity or redundancy is justified by the memory savings, and I think the time savings will be pretty minimal.
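To make the "compute the size first, then write directly" idea concrete, here is a minimal sketch. It is not code from the patch: it uses a string table instead of the dynamic symbol table because that is shorter to show, a std::vector<char> stands in for the FileOutputBuffer, and the name writeStringTable is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: lays out an ELF-style string table (a leading NUL
// byte followed by NUL-terminated names) without an intermediate copy.
std::vector<char> writeStringTable(const std::vector<std::string> &Names) {
  // Loop 1: only compute the total size.
  std::size_t Size = 1; // the leading NUL byte
  for (const std::string &Name : Names)
    Size += Name.size() + 1; // name plus its terminator

  // Allocate the output once. In the tool this would be the output file
  // buffer; a zero-filled vector stands in for it here.
  std::vector<char> Buf(Size, '\0');

  // Loop 2: write each name directly at its final offset.
  std::size_t Offset = 1;
  for (const std::string &Name : Names) {
    std::memcpy(Buf.data() + Offset, Name.data(), Name.size());
    Offset += Name.size() + 1; // the terminator is already zero
  }
  assert(Offset == Size && "both loops must agree on the layout");
  return Buf;
}
```

The cost is the second loop over the input; the saving is that nothing is built up in a temporary container before the final buffer exists.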

First Way of thinking about it (As a Build System)

You can think of each Lazy value as being a build step. That build step then references other build steps as dependencies. Before completing, a build step first evaluates the build steps it depends on. As long as no cyclic dependencies show up it all works out. To ensure that there are no cyclic dependencies we have two methods. Method 1) is the Blackhole boolean, which is set to true while a Lazy value is being evaluated so that if evaluation ever gets back to that same value it can raise an error. The name Blackhole comes from some literature regarding a different way of thinking about it. Method 2) is to assume the minimum possible at each build step. By assuming the minimum possible at each build step you ensure (generally) that you logically can't have any cycles without the idea as a whole being invalid. This is not completely true for lots of little reasons but the main issue is that assumptions produce faster code. Right now I have the number of loops over the symbols down to just 2, which I think is the minimum possible if you were to hand-roll this as is.
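A minimal sketch of Method 1, for illustration only. The Lazy class below is a hypothetical stand-in and not the class from the patch; in particular it reports a cycle by throwing, which the real code would likely not do since LLVM is normally built without exceptions. The section names and sizes in layoutExample are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for a Lazy value: computed on first use, cached
// afterwards, and marked as a "blackhole" while it is being computed.
template <typename T> class Lazy {
  std::function<T()> Compute;
  std::optional<T> Value;
  bool Blackhole = false; // true only while Compute is running

public:
  explicit Lazy(std::function<T()> F) : Compute(std::move(F)) {
    assert(Compute && "a Lazy value needs a build step");
  }

  const T &get() {
    if (Value)
      return *Value; // already built
    if (Blackhole)   // we got back to ourselves: a dependency cycle
      throw std::logic_error("cyclic dependency between Lazy values");
    Blackhole = true;
    try {
      Value = Compute(); // may call get() on other Lazy values
    } catch (...) {
      Blackhole = false;
      throw;
    }
    Blackhole = false;
    return *Value;
  }
};

// One build step depending on two others (made-up numbers).
uint64_t layoutExample() {
  Lazy<uint64_t> DynStrOffset([] { return uint64_t(64); });
  Lazy<uint64_t> DynStrSize([] { return uint64_t(24); });
  Lazy<uint64_t> DynamicOffset(
      [&] { return DynStrOffset.get() + DynStrSize.get(); });
  return DynamicOffset.get();
}

// Two steps that depend on each other; the Blackhole flag catches it.
bool cycleIsDetected() {
  Lazy<int> *BPtr = nullptr;
  Lazy<int> A([&] { return BPtr->get() + 1; });
  Lazy<int> B([&] { return A.get() + 1; });
  BPtr = &B;
  try {
    A.get();
  } catch (const std::logic_error &) {
    return true;
  }
  return false;
}
```

Asking for DynamicOffset pulls in its two dependencies first, and a value that is never asked for is never computed.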

Second Way of thinking about it (As futures)

You can think of each Lazy value as a promise to compute that value at some point. Each future is then resolved as soon as it can be, but it might have to wait on another future. Deadlock occurs when there's a cycle. In fact, if we used std::future with delayed evaluation instead of Lazy this would work and we could process the graph in parallel. The graph is quite small at the moment, however, since individual symbols are not members of the graph, so I don't think that would be useful.
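As a rough illustration of the futures view, the same kind of dependency can be written with std::async and std::launch::deferred, which runs each task lazily on the first call to get(). This is a sketch, not what the patch does; the names and sizes are made up, and it only covers the acyclic case.

```cpp
#include <cassert>
#include <cstdint>
#include <future>

// Each value is a deferred future; nothing runs until get() is called.
uint64_t futuresExample() {
  std::shared_future<uint64_t> HeaderSize =
      std::async(std::launch::deferred, [] { return uint64_t(64); }).share();
  std::shared_future<uint64_t> StrTabSize =
      std::async(std::launch::deferred, [] { return uint64_t(24); }).share();

  // This future waits on the other two when it is finally evaluated.
  std::shared_future<uint64_t> TotalSize =
      std::async(std::launch::deferred, [HeaderSize, StrTabSize] {
        return HeaderSize.get() + StrTabSize.get();
      }).share();

  assert(TotalSize.valid());
  return TotalSize.get();
}
```

Switching the policy to std::launch::async would let independent nodes run on separate threads, which is the parallelism mentioned above.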

Third Way of thinking about it (As an instance of the Lob combinator)

My background before working in systems was in programming language theory. There's something called the Lob combinator, which can take a self-referential, spaghetti-like data structure and actually produce it, links and all. It has many forms and this is one of them. The Lob combinator would evaluate all of the nodes (a node in such a structure is equivalent to a node in the build graph from Way 1) but we only really need a few. The Lob combinator, in its truly magical form, only works with lazy values, however, so I'm just replicating that here. Lazy values are represented by "thunks" in practice. The idea of making a thunk a "blackhole" while it's being evaluated to catch cycles comes from some reading I did on GHC's implementation of Haskell, which lets it dynamically catch some forms of infinite loops that don't make progress (but not all, lest it solve the halting problem).
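For readers who have not seen it, here is a loose C++ imitation of the idea, written as a sketch rather than as the patch's implementation. Every cell is a function of the finished structure, and cells are only evaluated on demand. The names Loeb, Cell, Lookup and makeExample are all hypothetical, and cycles are reported by throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// A cell computes its value given a way to look up any cell of the result.
using Lookup = std::function<int(std::size_t)>;
using Cell = std::function<int(const Lookup &)>;

class Loeb {
  std::vector<Cell> Cells;
  std::vector<std::optional<int>> Values; // cached results (the thunks)
  std::vector<bool> Blackhole;            // cells currently being evaluated

public:
  explicit Loeb(std::vector<Cell> C)
      : Cells(std::move(C)), Values(Cells.size()),
        Blackhole(Cells.size(), false) {}

  int get(std::size_t I) {
    assert(I < Cells.size() && "cell index out of range");
    if (Values[I])
      return *Values[I];
    if (Blackhole[I])
      throw std::logic_error("cell depends on itself");
    Blackhole[I] = true;
    // The cell sees the whole structure through a lookup function.
    int V = Cells[I]([this](std::size_t J) { return get(J); });
    Blackhole[I] = false;
    Values[I] = V;
    return V;
  }

  bool evaluated(std::size_t I) const { return Values.at(I).has_value(); }
};

// Cell 2 refers to cell 1, which refers to cell 0; cell 3 is unrelated.
Loeb makeExample() {
  std::vector<Cell> Cells = {
      [](const Lookup &) { return 10; },
      [](const Lookup &At) { return At(0) + 5; },
      [](const Lookup &At) { return At(1) * 2; },
      [](const Lookup &) { return 100; },
  };
  return Loeb(std::move(Cells));
}
```

Calling get(2) on the example evaluates cells 0, 1 and 2 and leaves cell 3 untouched, which is the "we only really need a few" part.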

Fourth Way of thinking about it (As an instance of a build system a la "Build Systems a la carte")

This is a kind of crappy (or more restricted, if you'd rather) version of what they outline in that paper for a dynamic build system. The Lob combinator and build systems are thus related.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55864/new/

https://reviews.llvm.org/D55864

Files:
  llvm/include/llvm/TextAPI/ELF/ELFStub.h
  llvm/llvm/test/tools/llvm-elfabi/binary-write-neededlibs.test
  llvm/llvm/test/tools/llvm-elfabi/binary-write-pheaders.test
  llvm/llvm/test/tools/llvm-elfabi/binary-write-sheaders.test
  llvm/llvm/test/tools/llvm-elfabi/binary-write-soname.test
  llvm/test/tools/llvm-elfabi/binary-write-neededlibs.test
  llvm/test/tools/llvm-elfabi/binary-write-pheaders.test
  llvm/test/tools/llvm-elfabi/binary-write-sheaders.test
  llvm/test/tools/llvm-elfabi/binary-write-soname.test
  llvm/test/tools/llvm-elfabi/invalid-bin-target.test
  llvm/test/tools/llvm-elfabi/missing-bin-target.test
  llvm/test/tools/llvm-elfabi/write-elf32be-ehdr.test
  llvm/test/tools/llvm-elfabi/write-elf32le-ehdr.test
  llvm/test/tools/llvm-elfabi/write-elf64be-ehdr.test
  llvm/test/tools/llvm-elfabi/write-elf64le-ehdr.test
  llvm/tools/llvm-elfabi/ELFObjHandler.cpp
  llvm/tools/llvm-elfabi/ELFObjHandler.h
  llvm/tools/llvm-elfabi/llvm-elfabi.cpp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D55864.197498.patch
Type: text/x-patch
Size: 39600 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190501/ba53dffc/attachment-0001.bin>