[PATCH] D116769: [ifs] Allow llvm-ifs to generate text stub from elf stub

Haowei Wu via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 14 15:13:11 PST 2022


haowei added inline comments.


================
Comment at: llvm/lib/InterfaceStub/ELFObjHandler.cpp:549
+  }
+  if (Shdrs.get().size()) {
+    for (const Elf_Shdr &Sec : *Shdrs) {
----------------
jhenderson wrote:
> haowei wrote:
> > jhenderson wrote:
> > > jhenderson wrote:
> > > > Noting that most of this function is almost identical to above, so probably you want to have a function that finds a dynsym section and returns it, so that these two functions can work on them as appropriate.
> > > How about having a function that takes a lambda "OnFound", to share the majority of the logic between these two functions (that may or may not then be sensible to add to the ElfFile interface). Approximate untested implementation and usage:
> > > 
> > > ```
> > > template <class ELFT>
> > > Expected<const uint8_t*> getSectionFromDynsym(const ElfFile<ELFT> &ElfFile, ELFT::Elf_Addr DynAddr, StringRef Name, function_ref<Expected<const uint8_t *>(const ELFT::Shdr &, const ELFT::ShdrRange &) OnFound) {
> > >     using Elf_Shdr_Range = typename ELFT::ShdrRange;
> > >   using Elf_Shdr = typename ELFT::Shdr;
> > > 
> > >   Expected<Elf_Shdr_Range> Shdrs = ElfFile.sections();
> > >   if (!Shdrs)
> > >     return Shdrs.takeError();
> > > 
> > >   if (Shdrs.get().size()) {
> > >     for (const Elf_Shdr &Sec : *Shdrs) {
> > >       if (Sec.sh_type == ELF::SHT_DYNSYM) {
> > >         // If multiple .dynsym are present, use the first one.
> > >         // This behavior aligns with llvm::object::ELFFile::getDynSymtabSize().
> > >         return OnFound(Sec, *Shdrs);
> > >       }
> > >     }
> > >   }  // Fallback to use offsets from .dynamic as .dynsym was not found.
> > >   Expected<const uint8_t *> PtrOrErr =
> > >       ElfFile.toMappedAddr(DynAddr);
> > >   if (!PtrOrErr)
> > >     return appendToError(PtrOrErr.takeError(),
> > >                          "when locating " + Name + " section contents");
> > >   return *PtrOrErr;
> > > }
> > > 
> > > template <class ELFT>
> > > static Expected<const uint8_t *> getDynSymPtr(const ELFFile<ELFT> &ElfFile,
> > >                                               const DynamicEntries &DynEnt) {
> > >   return getSectionFromDynsym(ElfFile, DynEnt.DynSymAddr, "dynamic symbol table", [&ElfFile](const ELFT::Elf_Shdr &Sec, const ELFT::ShdrRange &Shdrs){ return ElfFile.base() + Sec.sh_offset(); });
> > > }
> > > 
> > > template <class ELFT>
> > > static Expected<StringRef> getDynStr(const ELFFile<ELFT> &ElfFile,
> > >                                               const DynamicEntries &DynEnt) {
> > >   Expected<StringRef> ResultOrErr = getSectionFromDynsym(ElfFile, DynEnt.DynStrAddr, "dynamic string table", [&ElfFile](const ELFT::Elf_Shdr &Sec, const ELFT::ShdrRange &Shdrs){ return ElfFile.getStringTableForSymtab(Sec, *Shdrs); });
> > >   if (!ResultOrErr)
> > >     return ResultOrErr;
> > >   return StringRef(reinterpret_cast<const char *>(*ResultOrErr), DynEnt.StrSize);
> > > }
> > > ```
> > > I've also made some more idiomatic changes to the names and usage of the `Expected` at the end, which you should probably adopt anyway (specifically using "*OrErr" for the variable name, and using `operator*` to get the real value). Also, as a related aside, I'd avoid using explicitly ".dynsym" and ".dynstr" as names if you're looking up by type, as there is a chance these sections aren't named that. My example code should illustrate it better, hopefully.
> > Thanks for providing this code snippet, I modified it and finished a version that passed all the test: https://gist.github.com/zeroomega/9552161efca45ad8af17a3f312582753
> > 
> > Unfortunately I don't think it can achieve the goal we want. IMO the lambda solution is a bit less readable. And it is about 5lines longer if you strip the comments. I think the cause is that getDynStr and getDynSym has 2 different type of return values. The first one returns a StringRef while the other one returns a raw pointer. The API `getStringTableForSymtab` used in getDynStr has the same return type. If both function used a common helper function, the runtion will need to return raw pointer, and that is where all the conversion needed. If you look at [L50](https://gist.github.com/zeroomega/9552161efca45ad8af17a3f312582753#file-lambda_solution-cpp-L50), I have to convert StringRef back to raw pointer to accommodate that. And I have to check the `Expect<T>` before the conversion, otherwise I am getting `Expected<T> must be checked before access or destruction.` error.
> > 
> > 
> My main goal is removing the near-identical code between these two functions, as duplicated code will lead to more maintenance problems down the line. I have two ideas to resolve this:
> 
> 1) We could solve this issue by changing the signature of `getDynSymPtr` to return an `Expected<StringRef>`. A cast could be added to convert within the lambda, if needed, but it doesn't make it significantly more complex. The call site would need a small tweak too, but it could happen after the `Expected` has been checked (a quick look shows there's already a cast, so it just would require addition of a `.data()` call somewhere when doing the existing cast).
> 
> 2) Another idea I had would be to somehow template the return type. I believe this isn't an option using plain old functions, but you could wrap it in a templated class, I believe, and make the function a member of that class.
> 
> I'm not sure which would look cleaner overall, but both should work, and resolve the duplicated code.
For suggestion 1. That is a hard no for me. DynSym data are binary I don't think it is wise to fill it into a StringRef which is suppose to just host references to strings.  For suggestion 2. I would like to avoid use template class type for return value. It does not help the readability and makes the problem complicated. Instead, I implemented a solution in Diff 9, in which I reuse the code as best as I can.  Please take a look.

I am still lean to Diff 8 since it is more straight forward and easier to understand. But if you are insist on no code should be repeated, I am fine with the solution in Diff 9.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116769/new/

https://reviews.llvm.org/D116769



More information about the llvm-commits mailing list