[PATCH] D55864: [elfabi] Write program headers, .dynamic, .dynstr, and .shstrtab

Jake Ehrlich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 1 13:37:00 PDT 2019


jakehehrlich marked 6 inline comments as done.
jakehehrlich added a comment.

I can't upload a new patch right now but I plan on making all recommended changes, even the ones I didn't reply to.

In D55864#1485695 <https://reviews.llvm.org/D55864#1485695>, @MaskRay wrote:

> You may have put the rationale somewhere but they are not easily searchable. Please allow me to ask why there is a desire to write yet another ELF emitter. Can't we invest more in existing tools like yaml2obj? Is there any inherent limitation that prevents it from being more dynamic?


Good question. yaml2obj doesn't do the same thing and doesn't exist as a library. The point of this binary is to be as small as possible while maintaining linkability and compatibility with tools like llvm-readobj and other such tools. There is some talk of taking the code in llvm-objcopy, getting it up to where it could be used as a library, making a number of other tweaks, and then using that as a more general purpose emission library. That would also not do the same thing however. The trouble is that every ELF emitter has been designed with a specific purpose in mind and most were not designed as libraries. Furthermore when you consider what each use case has to compute and then specify independent of a general emission library it's not clear you really save much. You have the MC code, you have lld's code, you have llvm-objcopy, you have yaml2obj, etc... but they all have their quirks and specific use cases. the MC emiter has an interface that would be unweildy and imprecise here. llvm-objcopy doesn't help with setting these addresses, can't construct dynamic symbol tables in this way, and assumes that it has access to an input file from which its making modifications, doesn't support adding dynamic symbols yet, and generally makes it hard (on purpose) to tweak the memory image of a file. yaml2obj is really only designed to produce test files and is IMO not in a position to replace this tool. Recent work has given it program header support, dynamic symbol table support, and made it get in the way of specifying specific offsets and addresses when you know what you're doing but it still doesn't rise to the level of something we could use here. Even if it did rise to the level of something we could use here we would just wind up constructing its internal format using the same amount of code only to have it go and do the same basic thing.

Here's a claim. The ELFTypes library *is* the universal emitter already. Each of those tools are performing a particular computation that the others are not except perhaps yaml2obj which is really just working its way towards a low fidelity textual representation of ELF files. There are some commonly repeated things here between all of these libraries that could stand to abstracted out but its actually not the majority of the code in any of these tools or libraries. Until a serious proposal for a universal emitter comes up we keep rewriting those common parts. Still 90% of what's common is already solved by the ELFTypes library



================
Comment at: llvm/llvm/test/tools/llvm-elfabi/binary-write-pheaders.test:18
+# PHDRS-NEXT:   PhysicalAddress: 0x0{{$}}
+# PHDRS-NEXT:   FileSize: {{.*$}}
+# PHDRS-NEXT:   MemSize: {{.*$}}
----------------
MaskRay wrote:
> `{{.*$}}` can just be omitted. Many other tests do this.
> 
> If you want to use `FileCheck --match-full-lines` to omit the regex in `4096{{$}}`, they have to be kept.
Yep. Don't worry about the tests for now. I'll need to update them and they're going to change drastically.


================
Comment at: llvm/llvm/test/tools/llvm-elfabi/binary-write-sheaders.test:60
+# SECTIONS-NEXT:   Index: 3{{$}}
+# SECTIONS-NEXT:   Name: .shstrtab (18)
+# SECTIONS-NEXT:   Type: SHT_STRTAB (0x3)
----------------
MaskRay wrote:
> It'd be great to get rid of `.shstrtab` and just use `.strtab`.
> 
> Newer llvm (since 2015) no longer produces `.shstrtab` as a size optimization.
That's actually what this patch does. These tests are inherited from the previous version of this patch and have not been modified to work yet. After I remove the changes planned part the tests will work. We'll actually need even more tests than what I have here to properly test this since the tests at hand only check a few sections and .dynamic contents.


================
Comment at: llvm/tools/llvm-elfabi/ELFObjHandler.cpp:54
+  mutable Optional<T> Value;
+  mutable bool Blackhole = false;
+
----------------
MaskRay wrote:
> Other than having a mutable `Blackhole`, you can make `Func = []() { llvm_unreachable(...); };` while evaluating the thunk.
Good idea! I think that's actually what GHC does. I'll do that.


================
Comment at: llvm/tools/llvm-elfabi/ELFObjHandler.cpp:523
+  static void Write(uint8_t *Data, const T& Value) {
+    *reinterpret_cast<T*>(Data) = Value;
+  }
----------------
MaskRay wrote:
> `T*` -> `T *`
Yeah there are a bunch of formatting issues. I'll run clang-format on these when I upload next.


================
Comment at: llvm/tools/llvm-elfabi/ELFObjHandler.cpp:583
+    // Define the non-TLS section.
+    DefSec.Name = ".def";
+    DefSec.Shdr = [this](Elf_Shdr& Shdr) {
----------------
MaskRay wrote:
> What is this section used for?
It's the section in which all defined non-TLS sections are defined. It's synthesized as a NOBITS section to avoid using any space. 


================
Comment at: llvm/tools/llvm-elfabi/ELFObjHandler.cpp:599
+    // Define the TLS section.
+    TlsSec.Name = ".tls";
+    TlsSec.Shdr = [this](Elf_Shdr& Shdr) {
----------------
MaskRay wrote:
> Which architecture uses the section name `.tls`?
> 
> Did you mean `.tbss`?
.tbss would imply that this is writable when it isn't. In practice read-only data is already thread safe so there is no need for a read only TLS section. There is also in general no reason to have a read-only NOBITS section since the content is just zeros. This means there aren't good names to use for what I'm doing here. These section names don't matter functionally however. I'm open to many other names. I just went for short names where a name used in the wild didn't exist.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55864/new/

https://reviews.llvm.org/D55864





More information about the llvm-commits mailing list