[llvm] r188878 - MC: Refactor ObjectSymbolizer to make relocation/section info generation lazy.

David Blaikie dblaikie at gmail.com
Wed Aug 21 10:28:09 PDT 2013


On Wed, Aug 21, 2013 at 12:28 AM, Ahmed Bougacha
<ahmed.bougacha at gmail.com> wrote:
> Author: ab
> Date: Wed Aug 21 02:28:07 2013
> New Revision: 188878
>
> URL: http://llvm.org/viewvc/llvm-project?rev=188878&view=rev
> Log:
> MC: Refactor ObjectSymbolizer to make relocation/section info generation lazy.

Conversely, I'd actually have considered splitting this into a few
patches (since each is a "no functionality change intended" so far as
I can tell, so they're usually easier to review - moreso when it's
obvious what the mechanical operations were). For example, it /looks/
like the body of the ctor moved into the buildRelocationByAddrMap
function, but since the code moved I can't easily tell if they're
identical. A simple transformation from:

foo::foo() {
 ...
}

to:

foo::foo() {
  func();
}

foo::func() {
  ...
}

makes it easy to eyeball as "oh, you just pulled this out into a
function" - if the functions really need to be reordered, that can be
done separately, or even in the same commit potentially (since I'll
take your word for it if you claim "just moving this code out into a
function"). Then I can think more clearly about what it takes to do
this work lazily (in reality I don't know the MC code nearly well
enough to have an opinion on that, but at least I can gloss over that
patch since I would know I didn't have the context for it).

>
> Modified:
>     llvm/trunk/include/llvm/MC/MCObjectSymbolizer.h
>     llvm/trunk/lib/MC/MCObjectSymbolizer.cpp
>
> Modified: llvm/trunk/include/llvm/MC/MCObjectSymbolizer.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/MC/MCObjectSymbolizer.h?rev=188878&r1=188877&r2=188878&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/MC/MCObjectSymbolizer.h (original)
> +++ llvm/trunk/include/llvm/MC/MCObjectSymbolizer.h Wed Aug 21 02:28:07 2013
> @@ -32,22 +32,14 @@ class MCObjectSymbolizer : public MCSymb
>  protected:
>    const object::ObjectFile *Obj;
>
> -  typedef DenseMap<uint64_t, object::RelocationRef> AddrToRelocMap;
> -  typedef std::vector<object::SectionRef> SortedSectionList;
> -  SortedSectionList SortedSections;
> -
>    // Map a load address to the first relocation that applies there. As far as I
>    // know, if there are several relocations at the exact same address, they are
>    // related and the others can be determined from the first that was found in
>    // the relocation table. For instance, on x86-64 mach-o, a SUBTRACTOR
>    // relocation (referencing the minuend symbol) is followed by an UNSIGNED
>    // relocation (referencing the subtrahend symbol).
> -  AddrToRelocMap AddrToReloc;
> -
> -  // Helpers around SortedSections.
> -  SortedSectionList::const_iterator findSectionContaining(uint64_t Addr) const;
> -  void insertSection(object::SectionRef Section);
> -
> +  const object::RelocationRef *findRelocationAt(uint64_t Addr);
> +  const object::SectionRef *findSectionContaining(uint64_t Addr);
>
>    MCObjectSymbolizer(MCContext &Ctx, OwningPtr<MCRelocationInfo> &RelInfo,
>                       const object::ObjectFile *Obj);
> @@ -56,9 +48,9 @@ public:
>    /// \name Overridden MCSymbolizer methods:
>    /// @{
>    bool tryAddingSymbolicOperand(MCInst &MI, raw_ostream &cStream,
> -                                int64_t Value,
> -                                uint64_t Address, bool IsBranch,
> -                                uint64_t Offset, uint64_t InstSize);
> +                                int64_t Value, uint64_t Address,
> +                                bool IsBranch, uint64_t Offset,
> +                                uint64_t InstSize);
>
>    void tryAddingPcLoadReferenceComment(raw_ostream &cStream,
>                                         int64_t Value, uint64_t Address);
> @@ -68,6 +60,15 @@ public:
>    static MCObjectSymbolizer *
>      createObjectSymbolizer(MCContext &Ctx, OwningPtr<MCRelocationInfo> &RelInfo,
>                             const object::ObjectFile *Obj);
> +
> +private:
> +  typedef DenseMap<uint64_t, object::RelocationRef> AddrToRelocMap;
> +  typedef std::vector<object::SectionRef> SortedSectionList;
> +  SortedSectionList SortedSections;
> +  AddrToRelocMap AddrToReloc;
> +
> +  void buildSectionList();
> +  void buildRelocationByAddrMap();
>  };
>
>  }
>
> Modified: llvm/trunk/lib/MC/MCObjectSymbolizer.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/MCObjectSymbolizer.cpp?rev=188878&r1=188877&r2=188878&view=diff
> ==============================================================================
> --- llvm/trunk/lib/MC/MCObjectSymbolizer.cpp (original)
> +++ llvm/trunk/lib/MC/MCObjectSymbolizer.cpp Wed Aug 21 02:28:07 2013
> @@ -28,98 +28,51 @@ namespace {
>  class MCMachObjectSymbolizer : public MCObjectSymbolizer {
>  public:
>    MCMachObjectSymbolizer(MCContext &Ctx, OwningPtr<MCRelocationInfo> &RelInfo,
> -                         const object::MachOObjectFile *MachOOF)
> -    : MCObjectSymbolizer(Ctx, RelInfo, MachOOF)
> -  {}
> +                         const MachOObjectFile *MOOF) {}
>
>    void tryAddingPcLoadReferenceComment(raw_ostream &cStream,
> -                                       int64_t Value, uint64_t Address) {
> -    AddrToRelocMap::iterator RI = AddrToReloc.find(Address);
> -    if (RI != AddrToReloc.end()) {
> -      const MCExpr *RelExpr = RelInfo->createExprForRelocation(RI->second);
> -      if (!RelExpr || RelExpr->EvaluateAsAbsolute(Value) == false)
> -        return;
> -    }
> -    uint64_t Addr = Value;
> -    SortedSectionList::const_iterator SI = findSectionContaining(Addr);
> -    if (SI != SortedSections.end()) {
> -      const SectionRef &S = *SI;
> -      StringRef Name; S.getName(Name);
> -      uint64_t SAddr; S.getAddress(SAddr);
> -      if (Name == "__cstring") {
> -        StringRef Contents;
> -        S.getContents(Contents);
> -        Contents = Contents.substr(Addr - SAddr);
> -        cStream << " ## literal pool for: "
> -                << Contents.substr(0, Contents.find_first_of(0));
> -      }
> -    }
> -  }
> +                                       int64_t Value,
> +                                       uint64_t Address) LLVM_OVERRIDE;
>  };
>  } // End unnamed namespace
>
> +
> +void MCMachObjectSymbolizer::
> +tryAddingPcLoadReferenceComment(raw_ostream &cStream, int64_t Value,
> +                                uint64_t Address) {
> +  if (const RelocationRef *R = findRelocationAt(Address)) {
> +    const MCExpr *RelExpr = RelInfo->createExprForRelocation(*R);
> +    if (!RelExpr || RelExpr->EvaluateAsAbsolute(Value) == false)
> +      return;
> +  }
> +  uint64_t Addr = Value;
> +  if (const SectionRef *S = findSectionContaining(Addr)) {
> +    StringRef Name; S->getName(Name);
> +    uint64_t SAddr; S->getAddress(SAddr);
> +    if (Name == "__cstring") {
> +      StringRef Contents;
> +      S->getContents(Contents);
> +      Contents = Contents.substr(Addr - SAddr);
> +      cStream << " ## literal pool for: "
> +              << Contents.substr(0, Contents.find_first_of(0));
> +    }
> +  }
> +}
> +
>  //===- MCObjectSymbolizer -------------------------------------------------===//
>
>  MCObjectSymbolizer::MCObjectSymbolizer(MCContext &Ctx,
>                                         OwningPtr<MCRelocationInfo> &RelInfo,
>                                         const ObjectFile *Obj)
>      : MCSymbolizer(Ctx, RelInfo), Obj(Obj), SortedSections(), AddrToReloc() {
> -  error_code ec;
> -  for (section_iterator SI = Obj->begin_sections(),
> -                        SE = Obj->end_sections();
> -                        SI != SE;
> -                        SI.increment(ec)) {
> -    if (ec) break;
> -
> -    section_iterator RelSecI = SI->getRelocatedSection();
> -    if (RelSecI == Obj->end_sections())
> -      continue;
> -
> -    uint64_t StartAddr; RelSecI->getAddress(StartAddr);
> -    uint64_t Size; RelSecI->getSize(Size);
> -    bool RequiredForExec; RelSecI->isRequiredForExecution(RequiredForExec);
> -    if (RequiredForExec == false || Size == 0)
> -      continue;
> -    insertSection(*SI);
> -    for (relocation_iterator RI = SI->begin_relocations(),
> -                             RE = SI->end_relocations();
> -                             RI != RE;
> -                             RI.increment(ec)) {
> -      if (ec) break;
> -      // FIXME: libObject is inconsistent regarding error handling. The
> -      // overwhelming majority of methods always return object_error::success,
> -      // and assert for simple errors.. Here, ELFObjectFile::getRelocationOffset
> -      // asserts when the file type isn't ET_REL.
> -      // This workaround handles x86-64 elf, the only one that has a relocinfo.
> -      uint64_t Offset;
> -      if (Obj->isELF()) {
> -        const ELF64LEObjectFile *ELFObj = dyn_cast<ELF64LEObjectFile>(Obj);
> -        if (ELFObj == 0)
> -          break;
> -        if (ELFObj->getELFFile()->getHeader()->e_type == ELF::ET_REL) {
> -          RI->getOffset(Offset);
> -          Offset += StartAddr;
> -        } else {
> -          RI->getAddress(Offset);
> -        }
> -      } else {
> -        RI->getOffset(Offset);
> -        Offset += StartAddr;
> -      }
> -      // At a specific address, only keep the first relocation.
> -      if (AddrToReloc.find(Offset) == AddrToReloc.end())
> -        AddrToReloc[Offset] = *RI;
> -    }
> -  }
>  }
>
>  bool MCObjectSymbolizer::
>  tryAddingSymbolicOperand(MCInst &MI, raw_ostream &cStream,
>                           int64_t Value, uint64_t Address, bool IsBranch,
>                           uint64_t Offset, uint64_t InstSize) {
> -  AddrToRelocMap::iterator RI = AddrToReloc.find(Address + Offset);
> -  if (RI != AddrToReloc.end()) {
> -    if (const MCExpr *RelExpr = RelInfo->createExprForRelocation(RI->second)) {
> +  if (const RelocationRef *R = findRelocationAt(Address + Offset)) {
> +    if (const MCExpr *RelExpr = RelInfo->createExprForRelocation(*R)) {
>        MI.addOperand(MCOperand::CreateExpr(RelExpr));
>        return true;
>      }
> @@ -133,10 +86,8 @@ tryAddingSymbolicOperand(MCInst &MI, raw
>    uint64_t UValue = Value;
>    // FIXME: map instead of looping each time?
>    error_code ec;
> -  for (symbol_iterator SI = Obj->begin_symbols(),
> -       SE = Obj->end_symbols();
> -       SI != SE;
> -       SI.increment(ec)) {
> +  for (symbol_iterator SI = Obj->begin_symbols(), SE = Obj->end_symbols();
> +       SI != SE; SI.increment(ec)) {
>      if (ec) break;
>      uint64_t SymAddr; SI->getAddress(SymAddr);
>      uint64_t SymSize; SI->getSize(SymSize);
> @@ -166,13 +117,16 @@ tryAddingPcLoadReferenceComment(raw_ostr
>                                  int64_t Value, uint64_t Address) {
>  }
>
> +StringRef MCObjectSymbolizer::findExternalFunctionAt(uint64_t Addr) {
> +  return StringRef();
> +}
> +
>  MCObjectSymbolizer *
>  MCObjectSymbolizer::createObjectSymbolizer(MCContext &Ctx,
>                                             OwningPtr<MCRelocationInfo> &RelInfo,
>                                             const ObjectFile *Obj) {
> -  if (const MachOObjectFile *MachOOF = dyn_cast<MachOObjectFile>(Obj)) {
> -    return new MCMachObjectSymbolizer(Ctx, RelInfo, MachOOF);
> -  }
> +  if (const MachOObjectFile *MOOF = dyn_cast<MachOObjectFile>(Obj))
> +    return new MCMachObjectSymbolizer(Ctx, RelInfo, MOOF);
>    return new MCObjectSymbolizer(Ctx, RelInfo, Obj);
>  }
>
> @@ -183,32 +137,100 @@ static bool SectionStartsBefore(const Se
>    return SAddr < Addr;
>  }
>
> -MCObjectSymbolizer::SortedSectionList::const_iterator
> -MCObjectSymbolizer::findSectionContaining(uint64_t Addr) const {
> -  SortedSectionList::const_iterator
> +const SectionRef *MCObjectSymbolizer::findSectionContaining(uint64_t Addr) {
> +  if (SortedSections.empty())
> +    buildSectionList();
> +
> +  SortedSectionList::iterator
>      EndIt = SortedSections.end(),
>      It = std::lower_bound(SortedSections.begin(), EndIt,
>                            Addr, SectionStartsBefore);
>    if (It == EndIt)
> -    return It;
> +    return 0;
>    uint64_t SAddr; It->getAddress(SAddr);
>    uint64_t SSize; It->getSize(SSize);
>    if (Addr >= SAddr + SSize)
> -    return EndIt;
> -  return It;
> +    return 0;
> +  return &*It;
> +}
> +
> +const RelocationRef *MCObjectSymbolizer::findRelocationAt(uint64_t Addr) {
> +  if (AddrToReloc.empty())
> +    buildRelocationByAddrMap();
> +
> +  AddrToRelocMap::const_iterator RI = AddrToReloc.find(Addr);
> +  if (RI == AddrToReloc.end())
> +    return 0;
> +  return &RI->second;
> +}
> +
> +void MCObjectSymbolizer::buildSectionList() {
> +  error_code ec;
> +  for (section_iterator SI = Obj->begin_sections(), SE = Obj->end_sections();
> +                        SI != SE; SI.increment(ec)) {
> +    if (ec) break;
> +
> +    bool RequiredForExec; SI->isRequiredForExecution(RequiredForExec);
> +    if (RequiredForExec == false)
> +      continue;
> +    uint64_t SAddr; SI->getAddress(SAddr);
> +    uint64_t SSize; SI->getSize(SSize);
> +    SortedSectionList::iterator It = std::lower_bound(SortedSections.begin(),
> +                                                      SortedSections.end(),
> +                                                      SAddr,
> +                                                      SectionStartsBefore);
> +    if (It != SortedSections.end()) {
> +      uint64_t FoundSAddr; It->getAddress(FoundSAddr);
> +      if (FoundSAddr < SAddr + SSize)
> +        llvm_unreachable("Inserting overlapping sections");
> +    }
> +    SortedSections.insert(It, *SI);
> +  }
>  }
>
> -void MCObjectSymbolizer::insertSection(SectionRef Sec) {
> -  uint64_t SAddr; Sec.getAddress(SAddr);
> -  uint64_t SSize; Sec.getSize(SSize);
> -  SortedSectionList::iterator It = std::lower_bound(SortedSections.begin(),
> -                                                    SortedSections.end(),
> -                                                    SAddr,
> -                                                    SectionStartsBefore);
> -  if (It != SortedSections.end()) {
> -    uint64_t FoundSAddr; It->getAddress(FoundSAddr);
> -    if (FoundSAddr < SAddr + SSize)
> -      llvm_unreachable("Inserting overlapping sections");
> +void MCObjectSymbolizer::buildRelocationByAddrMap() {
> +  error_code ec;
> +  for (section_iterator SI = Obj->begin_sections(), SE = Obj->end_sections();
> +                        SI != SE; SI.increment(ec)) {
> +    if (ec) break;
> +
> +    section_iterator RelSecI = SI->getRelocatedSection();
> +    if (RelSecI == Obj->end_sections())
> +      continue;
> +
> +    uint64_t StartAddr; RelSecI->getAddress(StartAddr);
> +    uint64_t Size; RelSecI->getSize(Size);
> +    bool RequiredForExec; RelSecI->isRequiredForExecution(RequiredForExec);
> +    if (RequiredForExec == false || Size == 0)
> +      continue;
> +    for (relocation_iterator RI = SI->begin_relocations(),
> +                             RE = SI->end_relocations();
> +                             RI != RE;
> +                             RI.increment(ec)) {
> +      if (ec) break;
> +      // FIXME: libObject is inconsistent regarding error handling. The
> +      // overwhelming majority of methods always return object_error::success,
> +      // and assert for simple errors.. Here, ELFObjectFile::getRelocationOffset
> +      // asserts when the file type isn't ET_REL.
> +      // This workaround handles x86-64 elf, the only one that has a relocinfo.
> +      uint64_t Offset;
> +      if (Obj->isELF()) {
> +        const ELF64LEObjectFile *ELFObj = dyn_cast<ELF64LEObjectFile>(Obj);
> +        if (ELFObj == 0)
> +          break;
> +        if (ELFObj->getELFFile()->getHeader()->e_type == ELF::ET_REL) {
> +          RI->getOffset(Offset);
> +          Offset += StartAddr;
> +        } else {
> +          RI->getAddress(Offset);
> +        }
> +      } else {
> +        RI->getOffset(Offset);
> +        Offset += StartAddr;
> +      }
> +      // At a specific address, only keep the first relocation.
> +      if (AddrToReloc.find(Offset) == AddrToReloc.end())
> +        AddrToReloc[Offset] = *RI;
> +    }
>    }
> -  SortedSections.insert(It, Sec);
>  }
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits



More information about the llvm-commits mailing list