[PATCH] D64281: [Object] Create MutableELFObject Class for Doing Mutations on ELFObjectFiles [Part 1]

Alex Brachet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 31 23:10:17 PDT 2019


abrachet marked 14 inline comments as done.
abrachet added a comment.

Again responding to your comment @jhenderson but not quoting due to length.

I'll try to explain `MutableRange` further and why certain decisions were made, upon reading my previous response I don't think I explained well. I fear that this spells the writing on the wall for the current implementation because if I need to explain it isn't written very well and won't be maintainable!

Again for convenience I'll talk about sections. When it is initially constructed the `std::vector<T> NewValues` has a size of 0, the `std::vector<MappingType> Mappings` has as a size of as many sections in the original file. The `MappingTypes` will look like this: `{.IsNew = false, .Ptr = <pointer into the file>}`, but on modification that mapping looks like `{.IsNew = true, .Ptr = <index into NewValues>}`. This means that for unmodified sections there is an overhead of `sizeof(MappingType)`, not `sizeof(T)` so doing modifications on only some sections //should// be faster, or at least there is less copying and uses less memory.

Of course there's no arguing that it leads to what is pretty ugly (and obviously causes a lot of confusion) solution. To get to the `Elf_Shdr*` from a new section there is a lot of indirection. DataRef => Mapping => NewValues => Elf_Shdr. And then for an unmodified one it is DataRef => Mapping => Elf_Shdr.

I still think that modified sections cannot be just an `Elf_Shdr` and there isn't a clean way to hold either an `Elf_Shdr` or a `MutableELFSection`. I had played around with a union, which didn't work because `MutableELFSection` had a non trivial copy constructor, but this still requires that the uses of `MutableRange` know to distinguish between the original type and the modified type. I will also say that although it is ugly, and it would be ideal if what you call the plumbing was inside of `MutableRange`, that plumbing exists in `MutableELFObject`, but it isn't exposed outside of the class. `MutableELFObject` abstracts this and works the same way as its predecessors.



================
Comment at: llvm/include/llvm/Object/ELFObjectFile.h:801
   const Elf_Shdr *EShdr = getSection(Sec);
-  return !isBerkeleyText(Sec) && EShdr->sh_type != ELF::SHT_NOBITS &&
-         EShdr->sh_flags & ELF::SHF_ALLOC;
+  return !ELFObjectFile<ELFT>::isBerkeleyText(Sec) &&
+          EShdr->sh_type != ELF::SHT_NOBITS &&
----------------
jhenderson wrote:
> abrachet wrote:
> > jhenderson wrote:
> > > Are you sure you want to be calling the base-class method though? Isn't that going to go wrong if Sec is a modified or added section?
> > When these functions are called, they get called with a DataRef pointing to the Elf_Shdr whether they are new or not. I was segfaulting here before because in `ELFObjectFile` the DataRef's point to the Elf_Shdr but in `MutableELFObject` they are indexes into the `MutableRange`. I couldn't find a way for the DataRefs to be interpreted the same way between the two classes. This isn't ideal though.
> Oh, I realised I was mistaken as to which class we are in. I really don't like this change here at all. The base class shouldn't need to know that it has to call the base class version of the method. It should just work. Effectively the design here now means the base class has to know that there's a subclass that might change the meaning of some of its functions under-the-hood, and guard against that.
> 
> The solution might be to make getSection a virtual method and override it in your sub-class. What do you think of that?
> What do you think of that?
I think that saves so much boiler plate code! Can't believe I didn't see this earlier thank you. Also, I remember @jakehehrlich saying how I was previously doing it was not good, this is much better now. I now only have to override `moveSectionNext`, `getSectionName`, `getSectionIndex` and `getSectionContents`


================
Comment at: llvm/include/llvm/Object/MutableELFObject.h:134
+      : ELFObjectFile<ELFT>(std::move(B)),
+        Sections(B.section_begin(), B.section_end(),
+                 [&](SectionRef Ref) { return Ref.getRawDataRefImpl().p; }) {}
----------------
rupprecht wrote:
> I think this is a potentially-invalid reference, as `B` is `std::move`-d for the previous member. It looks like this works because the move constructor for `ELFObjectFile` is implemented as a copy constructor, although that is unusual (a typical implementation would swap() members):
> 
> ```
> template <class ELFT>
> ELFObjectFile<ELFT>::ELFObjectFile(ELFObjectFile<ELFT> &&Other)
>     : ELFObjectFile(Other.Data, Other.EF, Other.DotDynSymSec,
>                     Other.DotSymtabSec, Other.ShndxTable) {}
> ```
> 
> I'm curious if you can test this theory by "fixing" the move constructor to be something like:
> ```
> template <class ELFT>
> ELFObjectFile<ELFT>::ELFObjectFile(ELFObjectFile<ELFT> &&Other) {
>   using std::swap;
>   swap(Data, Other.Data);
>   swap(EF, Other.EF);
>   swap(DotDynSymSec, Other.DotDynSymSec);
>   swap(DotSymtabSec, Other.DotSymtabSec);
>   swap(ShndxTable, Other.ShndxTable);
> }
> ```
> (no need to check it in though)
> 
> I think you can just use `section_begin()/section_end()` directly here, and it will correctly call the methods from the parent class (the vtable doesn't update until later)
I couldn't test to see if it breaks with your example of swapping members because `ELFObjectFile` has no default constructor so all I could use were Others members to initialize and then the swap is just swapping two equal values.

> I think you can just use section_begin()/section_end() directly here, and it will correctly call the methods from the parent class (the vtable doesn't update until later)
This didn't work for me, my program never exited (I gave it ~10 seconds assuming a segfault was coming). I am still using `B.section_begin()/end()`. I also tried `ELFObjectFile<ELFT>::section_begin()` and that compiled but caused a crash. Tried to create the OwningArrayRef with nonsense data `OwningArrayRef(this=0x0000000106006280, Size=72057594037927936)`
```
ObjectTests(66699,0x10c97d5c0) malloc: can't allocate region
*** mach_vm_map(size=72057594037927936) failed (error code=3)
ObjectTests(66699,0x10c97d5c0) malloc: *** set a breakpoint in malloc_error_break to debug
libc++abi.dylib: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc
```



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64281/new/

https://reviews.llvm.org/D64281





More information about the llvm-commits mailing list