[PATCH] D116769: [ifs] Allow llvm-ifs to generate text stub from elf stub

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 1 04:46:35 PST 2022


jhenderson added a comment.

In D116769#3270596 <https://reviews.llvm.org/D116769#3270596>, @haowei wrote:

> If there is no further objection or comments, can I get a LGTM to land this change please?

Sorry, I've been off for two weeks, and only now getting back to these things.



================
Comment at: llvm/lib/InterfaceStub/ELFObjHandler.cpp:517-518
+
+  if (Type != ELF::SHT_STRTAB && Type != ELF::SHT_DYNSYM)
+    return createError("unknown type supplied");
+  // Look for DynSym from section headers.
----------------
This should be an `assert`.


================
Comment at: llvm/lib/InterfaceStub/ELFObjHandler.cpp:524
+
+  if (Shdrs.get().size()) {
+    for (const Elf_Shdr &Sec : *Shdrs) {
----------------
No need for this if: the loop will do nothing if `Shdrs` is empty.


================
Comment at: llvm/lib/InterfaceStub/ELFObjHandler.cpp:535-537
+        } else {
+          return ElfFile.base() + Sec.sh_offset;
+        }
----------------
No need for `else` after `return`


================
Comment at: llvm/lib/InterfaceStub/ELFObjHandler.cpp:545
+    std::string ErrMsg("when locating ");
+    ErrMsg += (Type == ELF::SHT_STRTAB ? ".dynstr" : ".dynsym");
+    ErrMsg += " section contents";
----------------
I'd probably refer to them as "dynamic string table" and "dynamic symbol table", rather than by section names, since they could be called other things.


================
Comment at: llvm/lib/InterfaceStub/ELFObjHandler.cpp:549
+  }
+  if (Shdrs.get().size()) {
+    for (const Elf_Shdr &Sec : *Shdrs) {
----------------
haowei wrote:
> jhenderson wrote:
> > haowei wrote:
> > > jhenderson wrote:
> > > > jhenderson wrote:
> > > > > Noting that most of this function is almost identical to above, so probably you want to have a function that finds a dynsym section and returns it, so that these two functions can work on them as appropriate.
> > > > How about having a function that takes a lambda "OnFound", to share the majority of the logic between these two functions (that may or may not then be sensible to add to the ElfFile interface). Approximate untested implementation and usage:
> > > > 
> > > > ```
> > > > template <class ELFT>
> > > > Expected<const uint8_t*> getSectionFromDynsym(const ElfFile<ELFT> &ElfFile, ELFT::Elf_Addr DynAddr, StringRef Name, function_ref<Expected<const uint8_t *>(const ELFT::Shdr &, const ELFT::ShdrRange &) OnFound) {
> > > >     using Elf_Shdr_Range = typename ELFT::ShdrRange;
> > > >   using Elf_Shdr = typename ELFT::Shdr;
> > > > 
> > > >   Expected<Elf_Shdr_Range> Shdrs = ElfFile.sections();
> > > >   if (!Shdrs)
> > > >     return Shdrs.takeError();
> > > > 
> > > >   if (Shdrs.get().size()) {
> > > >     for (const Elf_Shdr &Sec : *Shdrs) {
> > > >       if (Sec.sh_type == ELF::SHT_DYNSYM) {
> > > >         // If multiple .dynsym are present, use the first one.
> > > >         // This behavior aligns with llvm::object::ELFFile::getDynSymtabSize().
> > > >         return OnFound(Sec, *Shdrs);
> > > >       }
> > > >     }
> > > >   }  // Fallback to use offsets from .dynamic as .dynsym was not found.
> > > >   Expected<const uint8_t *> PtrOrErr =
> > > >       ElfFile.toMappedAddr(DynAddr);
> > > >   if (!PtrOrErr)
> > > >     return appendToError(PtrOrErr.takeError(),
> > > >                          "when locating " + Name + " section contents");
> > > >   return *PtrOrErr;
> > > > }
> > > > 
> > > > template <class ELFT>
> > > > static Expected<const uint8_t *> getDynSymPtr(const ELFFile<ELFT> &ElfFile,
> > > >                                               const DynamicEntries &DynEnt) {
> > > >   return getSectionFromDynsym(ElfFile, DynEnt.DynSymAddr, "dynamic symbol table", [&ElfFile](const ELFT::Elf_Shdr &Sec, const ELFT::ShdrRange &Shdrs){ return ElfFile.base() + Sec.sh_offset(); });
> > > > }
> > > > 
> > > > template <class ELFT>
> > > > static Expected<StringRef> getDynStr(const ELFFile<ELFT> &ElfFile,
> > > >                                               const DynamicEntries &DynEnt) {
> > > >   Expected<StringRef> ResultOrErr = getSectionFromDynsym(ElfFile, DynEnt.DynStrAddr, "dynamic string table", [&ElfFile](const ELFT::Elf_Shdr &Sec, const ELFT::ShdrRange &Shdrs){ return ElfFile.getStringTableForSymtab(Sec, *Shdrs); });
> > > >   if (!ResultOrErr)
> > > >     return ResultOrErr;
> > > >   return StringRef(reinterpret_cast<const char *>(*ResultOrErr), DynEnt.StrSize);
> > > > }
> > > > ```
> > > > I've also made some more idiomatic changes to the names and usage of the `Expected` at the end, which you should probably adopt anyway (specifically using "*OrErr" for the variable name, and using `operator*` to get the real value). Also, as a related aside, I'd avoid using explicitly ".dynsym" and ".dynstr" as names if you're looking up by type, as there is a chance these sections aren't named that. My example code should illustrate it better, hopefully.
> > > Thanks for providing this code snippet, I modified it and finished a version that passed all the test: https://gist.github.com/zeroomega/9552161efca45ad8af17a3f312582753
> > > 
> > > Unfortunately I don't think it can achieve the goal we want. IMO the lambda solution is a bit less readable. And it is about 5lines longer if you strip the comments. I think the cause is that getDynStr and getDynSym has 2 different type of return values. The first one returns a StringRef while the other one returns a raw pointer. The API `getStringTableForSymtab` used in getDynStr has the same return type. If both function used a common helper function, the runtion will need to return raw pointer, and that is where all the conversion needed. If you look at [L50](https://gist.github.com/zeroomega/9552161efca45ad8af17a3f312582753#file-lambda_solution-cpp-L50), I have to convert StringRef back to raw pointer to accommodate that. And I have to check the `Expect<T>` before the conversion, otherwise I am getting `Expected<T> must be checked before access or destruction.` error.
> > > 
> > > 
> > My main goal is removing the near-identical code between these two functions, as duplicated code will lead to more maintenance problems down the line. I have two ideas to resolve this:
> > 
> > 1) We could solve this issue by changing the signature of `getDynSymPtr` to return an `Expected<StringRef>`. A cast could be added to convert within the lambda, if needed, but it doesn't make it significantly more complex. The call site would need a small tweak too, but it could happen after the `Expected` has been checked (a quick look shows there's already a cast, so it just would require addition of a `.data()` call somewhere when doing the existing cast).
> > 
> > 2) Another idea I had would be to somehow template the return type. I believe this isn't an option using plain old functions, but you could wrap it in a templated class, I believe, and make the function a member of that class.
> > 
> > I'm not sure which would look cleaner overall, but both should work, and resolve the duplicated code.
> For suggestion 1. That is a hard no for me. DynSym data are binary I don't think it is wise to fill it into a StringRef which is suppose to just host references to strings.  For suggestion 2. I would like to avoid use template class type for return value. It does not help the readability and makes the problem complicated. Instead, I implemented a solution in Diff 9, in which I reuse the code as best as I can.  Please take a look.
> 
> I am still lean to Diff 8 since it is more straight forward and easier to understand. But if you are insist on no code should be repeated, I am fine with the solution in Diff 9.
> That is a hard no for me. DynSym data are binary I don't think it is wise to fill it into a StringRef which is suppose to just host references to strings.

Fine. Use a return type of `ArrayRef<char>`, and use `data() and size()` to construct the value then... A StringRef is a reference to an array of chars, so it's really the same... I'm pretty sure `StringRef` is often used in that context, even for binary data.

The current implementation in this patch isn't terrible, but it's not particularly nice either: the `Type` argument turns the function into a bi-modal function, so this function is being used to do two different things, which isn't really good software design. I think we can do better.

Latest alternate idea: have a class (perhaps simply called "DynSym" or similar), which during initialisation looks up the dynsym section header and stores it in a pointer, leaving it as null if not found. Then make the `getDynSym` and `getDynStr` functions member functions, which simply use the precalculated info to return the data as appropriate. This way, the shared data (the dynsym section header) is only looked up once. The two separate methods do only the behaviour that is specific to the relevant table you're looking up. Untested implementation:

```
class DynSym {
  using Elf_Shdr_Range = typename ELFT::ShdrRange;
  using Elf_Shdr = typename ELFT::Shdr;

public:
  static Expected<DynSym> create(const ELFFile<ELFT> &ElfFile, const DynamicEntries &DynEnt) {
      Expected<Elf_Shdr_Range> Shdrs = ElfFile.sections();
      if (!Shdrs)
        return Shdrs.takeError();
      return DynSym(ElfFile, DynEnt, *Shdrs);
  }

  Expected<const uint8_t*> getDynSym() {
    if(DynSymHdr)
      return ElfFile.base() + DynSymHdr->sh_offset;
    return getDynamicData(DynEnt.DynSymAddr, "dynamic symbol table");
  }

  // StringRef or uint8_t* return according to taste as to where the casts go.
  Expected<StringRef> getDynStr() {
    if(DynSymHdr)
      return ElfFile.getStringTableForSymtab(DynSymHdr, Shdrs);
    Expected<const uint8_t *> DataOrErr = getDynamicData(DynEnt.DynSymAddr, "dynamic string table");
    if (!DataOrErr)
      return DataOrErr.takeError();
    return StringRef(*DataOrErr, DynEnt.StrSize);
  }

private:
  DynSym(const ELFFile<ELFT> &ElfFile, const DynamicEntries &DynEnt, Elf_Shdr_Range Shdrs)
    : ElfFile(ElfFile), DynEnt(DynEnt), Shdrs(Shdrs), DynSymHdr(findDynSymHdr(Shdrs)) {}

  Elf_Shdr *findDynSymHdr() {
    // Probably use find_if?
    for (const Elf_Shdr &Sec : Shdrs)
      if (Sec.sh_type == SHT_DYNSYM)
        return &Sec;
    return nullptr;
  }

  Expected<const uint8_t *> getDynamicData(uint64_t Address, StringRef Name) {
      Expected<const uint8_t *> SecPtr = ElfFile.toMappedAddr(EntAddr);
      if (!SecPtr)
        return appendToError(SecPtr.takeError(), "when locating " + Name + " section contents");
      return *SecPtr;
  }

  const ELFFile<ELFT> &ElfFile;
  const DynamicEntries &DynEnt;
  Elf_Shdr_Range Shdrs;
  Elf_Shdr *DynSymHdr;
};
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116769/new/

https://reviews.llvm.org/D116769



More information about the llvm-commits mailing list