[llvm-commits] ELFReader.cpp update - Add support for References.

Michael Spencer bigcheesegs at gmail.com
Fri Sep 7 14:45:16 PDT 2012


On Fri, Sep 7, 2012 at 8:41 AM, Sid Manning <sidneym at codeaurora.org> wrote:
>
> My comments are below and updated patches are attached.
>
> One important new change is that I'm passing the References list to
> DefinedAtom so derefIterator can access the it.
>
>
> Thanks,
>
>
> On 09/01/12 00:04, Michael Spencer wrote:
>>
>> On Fri, Aug 31, 2012 at 12:40 PM, Sid Manning<sidneym at codeaurora.org>
>> wrote:
>
>
> ...
>
>
>>> +// Relocation References: Defined Atom may contain
>>> +// references that will need to be patched before
>>> +// the executable is written.
>>> +//
>>> +class ELFReference : public Reference {
>>> +public:
>>> +  ELFReference(Reference::Kind K,
>>> +               uint64_t O,
>>> +               const Atom *T,
>>> +               uint64_t N,
>>> +               Reference::Addend A)
>>> +    : Target(T)
>>> +    , TargetNameOffset(N)
>>> +    , OffsetInAtom(O)
>>> +    , Addend(A)
>>> +    , Kind(K) { }
>>
>>
>> Why doesn't this just take a tagged union of Elf_Rel_Impl* (for Rel and
>> Rela)
>> and grab the info from that on a call?
>
>
> Turned to a template, passing Elf_Rel(a) to
>
> ...
>
>
>>> +private:
>>> +  const Atom*  Target;
>>> +  uint64_t     TargetNameOffset;
>>
>>
>> TargetNameOffset is a weird way to handle mapping from ELF_R_SYM to Atom*.
>> I
>> think the best way to do this would be to add a public
>> Elf_Sym *getSymbol(uint32_t Index)
>> function to ELFObjectFile. Then add a DenseMap<ElfSym*, Atom*>  to go from
>> ELF_R_SYM to Atom*.
>>
> DenseMap added and getSymbol(uint32_t Index) has been added to ELF.h.
> Passing ELF_Rel(a) to constructor now.
>
> ...
>
>
>>> +  struct NameAtomPair {
>>> +                 NameAtomPair(StringRef N,
>>> +                              Atom *A)
>>> +                   : name(N), atom(A) {}
>>> +    StringRef name;
>>> +    Atom *atom;
>>> +  };
>>> +
>>
>>
>> Use std::pair.
>
> Done, sorta.  Its been removed because due to the use of
> llvm::DenseMap<const Elf_Sym *, Atom *>
>
> ...
>
>
>>> +      if (section->sh_type == llvm::ELF::SHT_RELA)
>>> +      {
>>
>>
>> No new line before {
>
> Done.
>
> ...
>
>
>>> +        sectionName = sectionName.drop_front(5);
>>> +        for (unsigned int i=0; i<contents.size()/sizeof(Elf_Rela); i++)
>>> {
>>
>>
>> This is incorrect. The distance between relocations is defined by
>> Elf_Shdr::st_entsize. This type of iteration should probably be added to
>> ELFObjectFile.
>
> Done, using section->getEntityCount().
>
>
>>
>>> +          RelocationAddendReferences[sectionName].push_back(relocs+i);
>>
>>
>> A reference to RelocationAddendReferences[sectionName] should be
>> hoisted outside of the loop.
>>
> Done.
>
> ...
>
>
>>> -        AbsoluteAtoms._atoms.push_back(
>>> -                             new (AtomStorage.Allocate<ELFAbsoluteAtom>
>>> ())
>>> -                             ELFAbsoluteAtom(*this, SymbolName,
>>> -                                             Symbol->st_value));
>>> +        ELFAbsoluteAtom<target_endianness, is64Bits>  *NewAtom = new
>>> +        ELFAbsoluteAtom<target_endianness, is64Bits>  (*this,
>>> +                                                      SymbolName,
>>> +                                                      Symbol->st_value);
>>
>>
>> Why is this not using placement new in AtomStorage? Also, you can use auto
>> here.
>> Same for the rest of these.
>
> This was mistake I made when I reworked the code, fixed.
>
>
>
>>
>>> +        // Make Elf_Rela references
>>> +        typename std::vector<Elf_Rela *>::iterator rai  =
>>> +                      RelocationAddendReferences[SectionName].begin();
>>> +        typename std::vector<Elf_Rela *>::iterator rae =
>>> +                      RelocationAddendReferences[SectionName].end();
>>> +
>>> +        // Only relocations that are inside the domain of the atom are
>>> +        // added.
>>> +        for (; rai != rae; rai++) {
>>
>>
>> This can use a c++11 for loop.
>
> Using "for (auto& rai : RelocationAddendReferences[SectionName])"
>
>
>>
>>> +          if (((*rai)->r_offset>= (*si)->st_value)&&
>>> +              ((*rai)->r_offset<  (*si)->st_value+contentSize)) {
>>> +
>>> +            ELFReference *eref = new ELFReference (
>>> +                (*rai)->getType(), (*rai)->r_offset-(*si)->st_value,
>>> +                nullptr, (*rai)->getSymbol(), (*rai)->r_addend);
>>
>>
>> This should use a BumpPtrAllocator too. Currently it's just leaked.
>
>
> Added ReaderStorage to store both References and Atoms.
>
>
>>
>> - Michael Spencer
>
>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
> The Linux Foundation

> Index: lib/ReaderWriter/ELF/ReaderELF.cpp
> ===================================================================
> --- lib/ReaderWriter/ELF/ReaderELF.cpp	(revision 162996)
> +++ lib/ReaderWriter/ELF/ReaderELF.cpp	(working copy)
> @@ -14,8 +14,10 @@
>
>  #include "lld/ReaderWriter/ReaderELF.h"
>  #include "lld/Core/File.h"
> +#include "lld/Core/Reference.h"
>
>  #include "llvm/ADT/ArrayRef.h"
> +#include "llvm/ADT/SmallString.h"
>  #include "llvm/ADT/StringRef.h"
>  #include "llvm/Object/ELF.h"
>  #include "llvm/Object/ObjectFile.h"
> @@ -39,7 +41,78 @@
>
>  namespace { // anonymous
>
> +
> +//
> +// Relocation References: Defined Atom may contain
> +// references that will need to be patched before
> +// the executable is written.
> +//

This should be a doxygen style comment.

> +template <llvm::support::endianness target_endianness, bool is64Bits>
> +class ELFReference : public Reference {
> +
> +  typedef llvm::object::Elf_Rel_Impl
> +                        <target_endianness, is64Bits, false> Elf_Rel;
> +  typedef llvm::object::Elf_Rel_Impl
> +                        <target_endianness, is64Bits, true> Elf_Rela;
> +public:
> +
> +  ELFReference(Elf_Rela *R, uint64_t O, const Atom *T)
> +    : Target(T)
> +    , TargetSymbolIndex(R->getSymbol())
> +    , OffsetInAtom(O)
> +    , Addend(R->r_addend)
> +    , Kind(R->getType()) {}
> +
> +  ELFReference(Elf_Rel *R, uint64_t O, const Atom *T)
> +    : Target(T)
> +    , TargetSymbolIndex(R->getSymbol())
> +    , OffsetInAtom(O)
> +    , Addend(0)
> +    , Kind(R->getType()) {}
> +
> +
> +  virtual uint64_t offsetInAtom() const {
> +    return OffsetInAtom;
> +  }
> +
> +  virtual Kind kind() const {
> +    return Kind;
> +  }
> +
> +  virtual void setKind(Kind k) {

Capitalized K.

> +    Kind = k;
> +  }
> +
> +  virtual const Atom* target() const {
> +    return Target;
> +  }
> +
> +  virtual uint64_t targetSymbolIndex() const {

It isn't obvious what this is used for. Doxygen comment.

> +    return TargetSymbolIndex;
> +  }
> +
> +  virtual Addend addend() const {
> +    return Addend;
> +  }
> +
> +  virtual void setAddend(Addend a) {

Capitalized A.

> +    Addend = a;
> +  }
> +
> +  virtual void setTarget(const Atom* newAtom) {

Capitalized NewAtom.

> +    Target = newAtom;
> +  }
> +private:
> +  const Atom*  Target;

* on left side.

> +  uint64_t     TargetSymbolIndex;
> +  uint64_t     OffsetInAtom;
> +  Addend       Addend;
> +  Kind         Kind;
> +};
> +
> +
>  // This atom class corresponds to absolute symbol

Doxygen comment.

> +template<llvm::support::endianness target_endianness, bool is64Bits>
>  class ELFAbsoluteAtom: public AbsoluteAtom {
>
>  public:
> @@ -126,15 +199,23 @@
>                   llvm::StringRef SN,
>                   const Elf_Sym *E,
>                   const Elf_Shdr *S,
> -                 llvm::ArrayRef<uint8_t> D)
> +                 llvm::ArrayRef<uint8_t> D,
> +                 unsigned int RS,
> +                 unsigned int RE,
> +                 std::vector<ELFReference<target_endianness, is64Bits> *> *R)

It would be nice if these had better names. Also this constructor should have
a doxygen comment explaining each param.

Also, R should be a
const std::vector<ELFReference<target_endianness, is64Bits> *> &. And
should probably be typedefed to a shorter name.

> +
>      : OwningFile(F)
>      , SymbolName(N)
>      , SectionName(SN)
>      , Symbol(E)
>      , Section(S)
> -    , ContentData(D) {
> +    , ContentData(D)
> +    , ReferenceStartIndex(RS)
> +    , ReferenceEndIndex(RE)
> +    , ReferenceList(R) {
>      static uint64_t ordernumber = 0;
> -    _ordinal = ++ordernumber;
> +    Ordinal = ++ordernumber;
> +
>    }
>
>    virtual const class File &file() const {
> @@ -146,7 +227,7 @@
>    }
>
>    virtual uint64_t ordinal() const {
> -    return _ordinal;
> +    return Ordinal;
>    }
>
>    virtual uint64_t size() const {
> @@ -284,21 +365,33 @@
>      return ContentData;
>    }
>
> -  virtual reference_iterator begin() const {
> -    return reference_iterator(*this, nullptr);
> +  DefinedAtom::reference_iterator begin() const {
> +    uintptr_t index = ReferenceStartIndex;
> +    const void* it = reinterpret_cast<const void*>(index);
> +    return reference_iterator(*this, it);
>    }
>
> -  virtual reference_iterator end() const {
> -    return reference_iterator(*this, nullptr);
> +  DefinedAtom::reference_iterator end() const {
> +    uintptr_t index = ReferenceEndIndex;
> +    const void* it = reinterpret_cast<const void*>(index);
> +    return reference_iterator(*this, it);
>    }
>
> -private:
> -  virtual const Reference *derefIterator(const void *iter) const {
> -    return nullptr;
> +  const Reference* derefIterator(const void* it) const {

It

> +    uintptr_t index = reinterpret_cast<uintptr_t>(it);
> +    assert(index >= ReferenceStartIndex);
> +    assert(index < ReferenceEndIndex);
> +    return ((*ReferenceList)[index]);
>    }
> -  virtual void incrementIterator(const void *&iter) const {
> +
> +  void incrementIterator(const void*& it) const {

It

> +    uintptr_t index = reinterpret_cast<uintptr_t>(it);
> +    ++index;
> +    it = reinterpret_cast<const void*>(index);
>    }
>
> +private:
> +
>    const File &OwningFile;
>    llvm::StringRef SymbolName;
>    llvm::StringRef SectionName;
> @@ -308,7 +401,10 @@
>    // ContentData will hold the bits that make up the atom.
>    llvm::ArrayRef<uint8_t> ContentData;
>
> -  uint64_t _ordinal;
> +  uint64_t Ordinal;
> +  unsigned int ReferenceStartIndex;
> +  unsigned int ReferenceEndIndex;
> +  std::vector<ELFReference<target_endianness, is64Bits> *> *ReferenceList;

Change to match constructor argument type above.

>  };
>
>
> @@ -318,8 +414,14 @@
>  template<llvm::support::endianness target_endianness, bool is64Bits>
>  class FileELF: public File {
>
> -  typedef llvm::object::Elf_Sym_Impl<target_endianness, is64Bits> Elf_Sym;
> -  typedef llvm::object::Elf_Shdr_Impl<target_endianness, is64Bits> Elf_Shdr;
> +  typedef llvm::object::Elf_Sym_Impl
> +                        <target_endianness, is64Bits> Elf_Sym;
> +  typedef llvm::object::Elf_Shdr_Impl
> +                        <target_endianness, is64Bits> Elf_Shdr;
> +  typedef llvm::object::Elf_Rel_Impl
> +                        <target_endianness, is64Bits, false> Elf_Rel;
> +  typedef llvm::object::Elf_Rel_Impl
> +                        <target_endianness, is64Bits, true> Elf_Rela;
>
>  public:
>    FileELF(std::unique_ptr<llvm::MemoryBuffer> MB, llvm::error_code &EC) :
> @@ -343,19 +445,80 @@
>
>      std::map< const Elf_Shdr *, std::vector<const Elf_Sym *>> SectionSymbols;
>
> +//  Handle: SHT_REL and SHT_RELA sections:
> +//  Increment over the sections, when REL/RELA section types are
> +//  found add the contents to the RelocationReferences map.
> +
> +    llvm::object::section_iterator sit(Obj->begin_sections());
> +    llvm::object::section_iterator sie(Obj->end_sections());

SIT, SIE

> +    for (; sit != sie; sit.increment(EC)) {
> +      if (EC)
> +        return;
> +
> +      const Elf_Shdr *section = Obj->getElfSection(sit);

Section

> +
> +      if (section->sh_type == llvm::ELF::SHT_RELA) {
> +        StringRef contents;

Contents

> +        if ((EC = Obj->getSectionContents(section, contents)))
> +          return;
> +
> +        llvm::StringRef sectionName;

SectionName

> +        if ((EC = Obj->getSectionName(section, sectionName)))
> +          return;
> +
> +        Elf_Rela *relocs = const_cast<Elf_Rela *>

Relocs. Also, there really shouldn't be a const cast here. We should never have
a pointer to non-const raw data while reading.

> +                    (reinterpret_cast<const Elf_Rela *>(contents.data()));
> +
> +        // Get rid of the leading .rela so Atoms can use their own section
> +        // name to find the relocs.
> +        sectionName = sectionName.drop_front(5);
> +
> +        auto &ref = RelocationAddendReferences[sectionName];

Ref.

> +        for (unsigned int i=0; i<section->getEntityCount(); i++) {
> +          ref.push_back(relocs+i);

This is still wrong. The offset to use for each relocation entry is based on
sh_entsize. See the getRel and getEntry templates in Object/ELF.h. We can expose
these, or better yet, proper iterators to them. I dislike duplicating the work
of file format parsing.

> +        }
> +
> +      }
> +
> +      if (section->sh_type == llvm::ELF::SHT_REL)
> +      {

No new line before {

> +        StringRef contents;

Contents.

> +        if ((EC = Obj->getSectionContents(section, contents)))
> +          return;
> +
> +        llvm::StringRef sectionName;

SectionName.

> +        if ((EC = Obj->getSectionName(section, sectionName)))
> +          return;
> +
> +        Elf_Rel *relocs = const_cast<Elf_Rel *>
> +                    (reinterpret_cast<const Elf_Rel *>(contents.data()));

Same as above.

> +
> +        // Get rid of the leading .rel so Atoms can use their own section
> +        // name to find the relocs.
> +        sectionName = sectionName.drop_front(4);
> +
> +        auto &ref = RelocationReferences[sectionName];
> +        for (unsigned int i=0; i<section->getEntityCount(); i++) {
> +          ref.push_back(relocs+i);

Same as above.

> +        }
> +
> +      }
> +    }
> +
> +//  Increment over all the symbols collecting atoms and symbol
> +//  names for later use.
> +
>      llvm::object::symbol_iterator it(Obj->begin_symbols());
>      llvm::object::symbol_iterator ie(Obj->end_symbols());
>
>      for (; it != ie; it.increment(EC)) {
>        if (EC)
>          return;
> -      llvm::object::SectionRef SR;
> -      llvm::object::section_iterator section(SR);
>
> -      if ((EC = it->getSection(section)))
> +      if ((EC = it->getSection(sit)))
>          return;
>
> -      const Elf_Shdr *Section = Obj->getElfSection(section);
> +      const Elf_Shdr *Section = Obj->getElfSection(sit);
>        const Elf_Sym  *Symbol  = Obj->getElfSymbol(it);
>
>        llvm::StringRef SymbolName;
> @@ -364,18 +527,24 @@
>
>        if (Symbol->st_shndx == llvm::ELF::SHN_ABS) {
>          // Create an absolute atom.
> -        AbsoluteAtoms._atoms.push_back(
> -                             new (AtomStorage.Allocate<ELFAbsoluteAtom> ())
> -                             ELFAbsoluteAtom(*this, SymbolName,
> -                                             Symbol->st_value));
> +        auto *NewAtom = new (ReaderStorage.Allocate
> +                       <ELFAbsoluteAtom<target_endianness, is64Bits> > ())
> +                        ELFAbsoluteAtom<target_endianness, is64Bits>
> +                          (*this, SymbolName, Symbol->st_value);
>
> +        AbsoluteAtoms._atoms.push_back(NewAtom);
> +        SymbolToAtomMapping.insert(std::make_pair(Symbol, NewAtom));
> +
>        } else if (Symbol->st_shndx == llvm::ELF::SHN_UNDEF) {
>          // Create an undefined atom.
> -        UndefinedAtoms._atoms.push_back(
> -            new (AtomStorage.Allocate<ELFUndefinedAtom<
> -                 target_endianness, is64Bits>>())
> -                 ELFUndefinedAtom<target_endianness, is64Bits> (
> -                                 *this, SymbolName, Symbol));
> +        auto *NewAtom = new (ReaderStorage.Allocate
> +                       <ELFUndefinedAtom<target_endianness, is64Bits> > ())
> +                        ELFUndefinedAtom<target_endianness, is64Bits>
> +                          (*this, SymbolName, Symbol);
> +
> +        UndefinedAtoms._atoms.push_back(NewAtom);
> +        SymbolToAtomMapping.insert(std::make_pair(Symbol, NewAtom));
> +
>        } else {
>          // This is actually a defined symbol. Add it to its section's list of
>          // symbols.
> @@ -427,33 +596,84 @@
>
>          bool IsCommon = false;
>          if (((*si)->getType() == llvm::ELF::STT_COMMON)
> -             || (*si)->st_shndx == llvm::ELF::SHN_COMMON)
> +          || (*si)->st_shndx == llvm::ELF::SHN_COMMON)
>            IsCommon = true;
>
>          // Get the symbol's content:
>          llvm::ArrayRef<uint8_t> SymbolData;
> +        uint64_t contentSize;

ContentSize.

>          if (si + 1 == se) {
>            // if this is the last symbol, take up the remaining data.
> -          SymbolData = llvm::ArrayRef<uint8_t>((uint8_t *)symbolContents.data()
> -                                    + (*si)->st_value,
> -                                    (IsCommon) ? 0 :
> -                                    ((i.first)->sh_size - (*si)->st_value));
> +          contentSize = (IsCommon) ? 0
> +                                   : ((i.first)->sh_size - (*si)->st_value);
>          }
>          else {
> -          SymbolData = llvm::ArrayRef<uint8_t>((uint8_t *)symbolContents.data()
> -                                    + (*si)->st_value,
> -                                    (IsCommon) ? 0 :
> -                                    (*(si + 1))->st_value - (*si)->st_value);
> +          contentSize = (IsCommon) ? 0
> +                                   : (*(si + 1))->st_value - (*si)->st_value;
>          }
>
> -        DefinedAtoms._atoms.push_back(
> -          new (AtomStorage.Allocate<ELFDefinedAtom<
> -               target_endianness, is64Bits> > ())
> -               ELFDefinedAtom<target_endianness, is64Bits> (*this,
> -                             SymbolName, SectionName,
> -                             *si, i.first, SymbolData));
> +        SymbolData = llvm::ArrayRef<uint8_t>((uint8_t *)symbolContents.data()
> +                                    + (*si)->st_value, contentSize);
> +
> +
> +        unsigned int referenceStart = References.size();

ReferencesStart.

> +
> +        // Only relocations that are inside the domain of the atom are
> +        // added.
> +
> +        // Add Rela (those with r_addend) references:
> +        for (auto& rai : RelocationAddendReferences[SectionName]) {

RAI.

> +          if ((rai->r_offset >= (*si)->st_value) &&
> +              (rai->r_offset < (*si)->st_value+contentSize)) {
> +
> +            auto *eref = new (ReaderStorage.Allocate

ERef.

> +                         <ELFReference<target_endianness, is64Bits> > ())
> +                          ELFReference<target_endianness, is64Bits> (
> +                          rai, rai->r_offset-(*si)->st_value, nullptr);
> +
> +            References.push_back(eref);
> +          }
> +        }
> +
> +        // Add Rel references:
> +        for (auto& ri : RelocationReferences[SectionName]) {

auto &RI.

> +          if (((ri)->r_offset >= (*si)->st_value) &&
> +              ((ri)->r_offset < (*si)->st_value+contentSize)) {
> +
> +            auto *eref = new (ReaderStorage.Allocate

ERef.

> +                         <ELFReference<target_endianness, is64Bits> > ())
> +                          ELFReference<target_endianness, is64Bits> (
> +                         (ri), (ri)->r_offset-(*si)->st_value, nullptr);
> +
> +            References.push_back(eref);
> +          }
> +        }
> +
> +        // Create the DefinedAtom and add it to the list of DefinedAtoms.
> +        auto *NewAtom = new (ReaderStorage.Allocate
> +                       <ELFDefinedAtom<target_endianness, is64Bits> > ())
> +                        ELFDefinedAtom<target_endianness, is64Bits>
> +                           (*this, SymbolName, SectionName,
> +                             *si, i.first, SymbolData,
> +                             referenceStart, References.size(), &References);
> +
> +        DefinedAtoms._atoms.push_back(NewAtom);
> +        SymbolToAtomMapping.insert(std::make_pair((*si), NewAtom));
> +
>        }
>      }
> +
> +// All the Atoms and References are created.  Now update each Reference's
> +// target with the Atom pointer it refers to.
> +    typename std::vector<ELFReference<target_endianness, is64Bits> *>::iterator
> +                                     ri  = References.begin();
> +    typename std::vector<ELFReference<target_endianness, is64Bits> *>::iterator
> +                                     eri = References.end();
> +    for (; ri != eri; ri++) {

C++11 for.

> +      const Elf_Sym  *Symbol  = Obj->getElfSymbol((*ri)->targetSymbolIndex());
> +      Atom *target = findAtom (Symbol);

Target. Although you don't need a temporary for this. Also findAtom(Symbol).

> +      (*ri)->setTarget(target);
> +    }
>    }
>
>    virtual void addAtom(const Atom&) {
> @@ -476,6 +696,15 @@
>      return AbsoluteAtoms;
>    }
>
> +  Atom *findAtom(const Elf_Sym  *symbol) {
> +    for (auto &ci : SymbolToAtomMapping) {
> +      if (ci.first == symbol)
> +        return ci.second;
> +    }

This should use DenseMap::find.

> +    return nullptr;
> +  }
> +
> +
>  private:
>    std::unique_ptr<llvm::object::ELFObjectFile<target_endianness, is64Bits> >
>        Obj;
> @@ -483,8 +712,18 @@
>    atom_collection_vector<UndefinedAtom>     UndefinedAtoms;
>    atom_collection_vector<SharedLibraryAtom> SharedLibraryAtoms;
>    atom_collection_vector<AbsoluteAtom>      AbsoluteAtoms;
> -  llvm::BumpPtrAllocator AtomStorage;
>
> +// This contains a list of relocations references.  In ELF if a
> +// section named, ".text" that has relocations will also have
> +// a section named ".rel.text" or ".rela.text" which will hold the
> +// entries. -- .rel or .rela is prepended to create the SHT_REL(A) section.

Doxygen comments.

> +  std::map<llvm::StringRef, std::vector<Elf_Rela *>> RelocationAddendReferences;
> +  std::map<llvm::StringRef, std::vector<Elf_Rel *>> RelocationReferences;
> +
> +  std::vector<ELFReference<target_endianness, is64Bits> *>References;
> +  llvm::DenseMap<const Elf_Sym *, Atom *>SymbolToAtomMapping;

Space after the >.

> +
> +  llvm::BumpPtrAllocator ReaderStorage;
>  };
>
>  //  ReaderELF is reader object that will instantiate correct FileELF
>

Other patch looks fine.

- Michael Spencer



More information about the llvm-commits mailing list