[llvm-commits] ELFReader.cpp update - Add support for References.
Michael Spencer
bigcheesegs at gmail.com
Fri Sep 7 14:45:16 PDT 2012
On Fri, Sep 7, 2012 at 8:41 AM, Sid Manning <sidneym at codeaurora.org> wrote:
>
> My comments are below and updated patches are attached.
>
> One important new change is that I'm passing the References list to
> DefinedAtom so derefIterator can access the it.
>
>
> Thanks,
>
>
> On 09/01/12 00:04, Michael Spencer wrote:
>>
>> On Fri, Aug 31, 2012 at 12:40 PM, Sid Manning<sidneym at codeaurora.org>
>> wrote:
>
>
> ...
>
>
>>> +// Relocation References: Defined Atom may contain
>>> +// references that will need to be patched before
>>> +// the executable is written.
>>> +//
>>> +class ELFReference : public Reference {
>>> +public:
>>> + ELFReference(Reference::Kind K,
>>> + uint64_t O,
>>> + const Atom *T,
>>> + uint64_t N,
>>> + Reference::Addend A)
>>> + : Target(T)
>>> + , TargetNameOffset(N)
>>> + , OffsetInAtom(O)
>>> + , Addend(A)
>>> + , Kind(K) { }
>>
>>
>> Why doesn't this just take a tagged union of Elf_Rel_Impl* (for Rel and
>> Rela)
>> and grab the info from that on a call?
>
>
> Turned to a template, passing Elf_Rel(a) to
>
> ...
>
>
>>> +private:
>>> + const Atom* Target;
>>> + uint64_t TargetNameOffset;
>>
>>
>> TargetNameOffset is a weird way to handle mapping from ELF_R_SYM to Atom*.
>> I
>> think the best way to do this would be to add a public
>> Elf_Sym *getSymbol(uint32_t Index)
>> function to ELFObjectFile. Then add a DenseMap<ElfSym*, Atom*> to go from
>> ELF_R_SYM to Atom*.
>>
> DenseMap added and getSymbol(uint32_t Index) has been added to ELF.h.
> Passing ELF_Rel(a) to constructor now.
>
> ...
>
>
>>> + struct NameAtomPair {
>>> + NameAtomPair(StringRef N,
>>> + Atom *A)
>>> + : name(N), atom(A) {}
>>> + StringRef name;
>>> + Atom *atom;
>>> + };
>>> +
>>
>>
>> Use std::pair.
>
> Done, sorta. Its been removed because due to the use of
> llvm::DenseMap<const Elf_Sym *, Atom *>
>
> ...
>
>
>>> + if (section->sh_type == llvm::ELF::SHT_RELA)
>>> + {
>>
>>
>> No new line before {
>
> Done.
>
> ...
>
>
>>> + sectionName = sectionName.drop_front(5);
>>> + for (unsigned int i=0; i<contents.size()/sizeof(Elf_Rela); i++)
>>> {
>>
>>
>> This is incorrect. The distance between relocations is defined by
>> Elf_Shdr::st_entsize. This type of iteration should probably be added to
>> ELFObjectFile.
>
> Done, using section->getEntityCount().
>
>
>>
>>> + RelocationAddendReferences[sectionName].push_back(relocs+i);
>>
>>
>> A reference to RelocationAddendReferences[sectionName] should be
>> hoisted outside of the loop.
>>
> Done.
>
> ...
>
>
>>> - AbsoluteAtoms._atoms.push_back(
>>> - new (AtomStorage.Allocate<ELFAbsoluteAtom>
>>> ())
>>> - ELFAbsoluteAtom(*this, SymbolName,
>>> - Symbol->st_value));
>>> + ELFAbsoluteAtom<target_endianness, is64Bits> *NewAtom = new
>>> + ELFAbsoluteAtom<target_endianness, is64Bits> (*this,
>>> + SymbolName,
>>> + Symbol->st_value);
>>
>>
>> Why is this not using placement new in AtomStorage? Also, you can use auto
>> here.
>> Same for the rest of these.
>
> This was mistake I made when I reworked the code, fixed.
>
>
>
>>
>>> + // Make Elf_Rela references
>>> + typename std::vector<Elf_Rela *>::iterator rai =
>>> + RelocationAddendReferences[SectionName].begin();
>>> + typename std::vector<Elf_Rela *>::iterator rae =
>>> + RelocationAddendReferences[SectionName].end();
>>> +
>>> + // Only relocations that are inside the domain of the atom are
>>> + // added.
>>> + for (; rai != rae; rai++) {
>>
>>
>> This can use a c++11 for loop.
>
> Using "for (auto& rai : RelocationAddendReferences[SectionName])"
>
>
>>
>>> + if (((*rai)->r_offset>= (*si)->st_value)&&
>>> + ((*rai)->r_offset< (*si)->st_value+contentSize)) {
>>> +
>>> + ELFReference *eref = new ELFReference (
>>> + (*rai)->getType(), (*rai)->r_offset-(*si)->st_value,
>>> + nullptr, (*rai)->getSymbol(), (*rai)->r_addend);
>>
>>
>> This should use a BumpPtrAllocator too. Currently it's just leaked.
>
>
> Added ReaderStorage to store both References and Atoms.
>
>
>>
>> - Michael Spencer
>
>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
> The Linux Foundation
> Index: lib/ReaderWriter/ELF/ReaderELF.cpp
> ===================================================================
> --- lib/ReaderWriter/ELF/ReaderELF.cpp (revision 162996)
> +++ lib/ReaderWriter/ELF/ReaderELF.cpp (working copy)
> @@ -14,8 +14,10 @@
>
> #include "lld/ReaderWriter/ReaderELF.h"
> #include "lld/Core/File.h"
> +#include "lld/Core/Reference.h"
>
> #include "llvm/ADT/ArrayRef.h"
> +#include "llvm/ADT/SmallString.h"
> #include "llvm/ADT/StringRef.h"
> #include "llvm/Object/ELF.h"
> #include "llvm/Object/ObjectFile.h"
> @@ -39,7 +41,78 @@
>
> namespace { // anonymous
>
> +
> +//
> +// Relocation References: Defined Atom may contain
> +// references that will need to be patched before
> +// the executable is written.
> +//
This should be a doxygen style comment.
> +template <llvm::support::endianness target_endianness, bool is64Bits>
> +class ELFReference : public Reference {
> +
> + typedef llvm::object::Elf_Rel_Impl
> + <target_endianness, is64Bits, false> Elf_Rel;
> + typedef llvm::object::Elf_Rel_Impl
> + <target_endianness, is64Bits, true> Elf_Rela;
> +public:
> +
> + ELFReference(Elf_Rela *R, uint64_t O, const Atom *T)
> + : Target(T)
> + , TargetSymbolIndex(R->getSymbol())
> + , OffsetInAtom(O)
> + , Addend(R->r_addend)
> + , Kind(R->getType()) {}
> +
> + ELFReference(Elf_Rel *R, uint64_t O, const Atom *T)
> + : Target(T)
> + , TargetSymbolIndex(R->getSymbol())
> + , OffsetInAtom(O)
> + , Addend(0)
> + , Kind(R->getType()) {}
> +
> +
> + virtual uint64_t offsetInAtom() const {
> + return OffsetInAtom;
> + }
> +
> + virtual Kind kind() const {
> + return Kind;
> + }
> +
> + virtual void setKind(Kind k) {
Capitalized K.
> + Kind = k;
> + }
> +
> + virtual const Atom* target() const {
> + return Target;
> + }
> +
> + virtual uint64_t targetSymbolIndex() const {
It isn't obvious what this is used for. Doxygen comment.
> + return TargetSymbolIndex;
> + }
> +
> + virtual Addend addend() const {
> + return Addend;
> + }
> +
> + virtual void setAddend(Addend a) {
Capitalized A.
> + Addend = a;
> + }
> +
> + virtual void setTarget(const Atom* newAtom) {
Capitalized NewAtom.
> + Target = newAtom;
> + }
> +private:
> + const Atom* Target;
* on left side.
> + uint64_t TargetSymbolIndex;
> + uint64_t OffsetInAtom;
> + Addend Addend;
> + Kind Kind;
> +};
> +
> +
> // This atom class corresponds to absolute symbol
Doxygen comment.
> +template<llvm::support::endianness target_endianness, bool is64Bits>
> class ELFAbsoluteAtom: public AbsoluteAtom {
>
> public:
> @@ -126,15 +199,23 @@
> llvm::StringRef SN,
> const Elf_Sym *E,
> const Elf_Shdr *S,
> - llvm::ArrayRef<uint8_t> D)
> + llvm::ArrayRef<uint8_t> D,
> + unsigned int RS,
> + unsigned int RE,
> + std::vector<ELFReference<target_endianness, is64Bits> *> *R)
It would be nice if these had better names. Also this constructor should have
a doxygen comment explaining each param.
Also, R should be a
const std::vector<ELFReference<target_endianness, is64Bits> *> &. And
should probably be typedefed to a shorter name.
> +
> : OwningFile(F)
> , SymbolName(N)
> , SectionName(SN)
> , Symbol(E)
> , Section(S)
> - , ContentData(D) {
> + , ContentData(D)
> + , ReferenceStartIndex(RS)
> + , ReferenceEndIndex(RE)
> + , ReferenceList(R) {
> static uint64_t ordernumber = 0;
> - _ordinal = ++ordernumber;
> + Ordinal = ++ordernumber;
> +
> }
>
> virtual const class File &file() const {
> @@ -146,7 +227,7 @@
> }
>
> virtual uint64_t ordinal() const {
> - return _ordinal;
> + return Ordinal;
> }
>
> virtual uint64_t size() const {
> @@ -284,21 +365,33 @@
> return ContentData;
> }
>
> - virtual reference_iterator begin() const {
> - return reference_iterator(*this, nullptr);
> + DefinedAtom::reference_iterator begin() const {
> + uintptr_t index = ReferenceStartIndex;
> + const void* it = reinterpret_cast<const void*>(index);
> + return reference_iterator(*this, it);
> }
>
> - virtual reference_iterator end() const {
> - return reference_iterator(*this, nullptr);
> + DefinedAtom::reference_iterator end() const {
> + uintptr_t index = ReferenceEndIndex;
> + const void* it = reinterpret_cast<const void*>(index);
> + return reference_iterator(*this, it);
> }
>
> -private:
> - virtual const Reference *derefIterator(const void *iter) const {
> - return nullptr;
> + const Reference* derefIterator(const void* it) const {
It
> + uintptr_t index = reinterpret_cast<uintptr_t>(it);
> + assert(index >= ReferenceStartIndex);
> + assert(index < ReferenceEndIndex);
> + return ((*ReferenceList)[index]);
> }
> - virtual void incrementIterator(const void *&iter) const {
> +
> + void incrementIterator(const void*& it) const {
It
> + uintptr_t index = reinterpret_cast<uintptr_t>(it);
> + ++index;
> + it = reinterpret_cast<const void*>(index);
> }
>
> +private:
> +
> const File &OwningFile;
> llvm::StringRef SymbolName;
> llvm::StringRef SectionName;
> @@ -308,7 +401,10 @@
> // ContentData will hold the bits that make up the atom.
> llvm::ArrayRef<uint8_t> ContentData;
>
> - uint64_t _ordinal;
> + uint64_t Ordinal;
> + unsigned int ReferenceStartIndex;
> + unsigned int ReferenceEndIndex;
> + std::vector<ELFReference<target_endianness, is64Bits> *> *ReferenceList;
Change to match constructor argument type above.
> };
>
>
> @@ -318,8 +414,14 @@
> template<llvm::support::endianness target_endianness, bool is64Bits>
> class FileELF: public File {
>
> - typedef llvm::object::Elf_Sym_Impl<target_endianness, is64Bits> Elf_Sym;
> - typedef llvm::object::Elf_Shdr_Impl<target_endianness, is64Bits> Elf_Shdr;
> + typedef llvm::object::Elf_Sym_Impl
> + <target_endianness, is64Bits> Elf_Sym;
> + typedef llvm::object::Elf_Shdr_Impl
> + <target_endianness, is64Bits> Elf_Shdr;
> + typedef llvm::object::Elf_Rel_Impl
> + <target_endianness, is64Bits, false> Elf_Rel;
> + typedef llvm::object::Elf_Rel_Impl
> + <target_endianness, is64Bits, true> Elf_Rela;
>
> public:
> FileELF(std::unique_ptr<llvm::MemoryBuffer> MB, llvm::error_code &EC) :
> @@ -343,19 +445,80 @@
>
> std::map< const Elf_Shdr *, std::vector<const Elf_Sym *>> SectionSymbols;
>
> +// Handle: SHT_REL and SHT_RELA sections:
> +// Increment over the sections, when REL/RELA section types are
> +// found add the contents to the RelocationReferences map.
> +
> + llvm::object::section_iterator sit(Obj->begin_sections());
> + llvm::object::section_iterator sie(Obj->end_sections());
SIT, SIE
> + for (; sit != sie; sit.increment(EC)) {
> + if (EC)
> + return;
> +
> + const Elf_Shdr *section = Obj->getElfSection(sit);
Section
> +
> + if (section->sh_type == llvm::ELF::SHT_RELA) {
> + StringRef contents;
Contents
> + if ((EC = Obj->getSectionContents(section, contents)))
> + return;
> +
> + llvm::StringRef sectionName;
SectionName
> + if ((EC = Obj->getSectionName(section, sectionName)))
> + return;
> +
> + Elf_Rela *relocs = const_cast<Elf_Rela *>
Relocs. Also, there really shouldn't be a const cast here. We should never have
a pointer to non-const raw data while reading.
> + (reinterpret_cast<const Elf_Rela *>(contents.data()));
> +
> + // Get rid of the leading .rela so Atoms can use their own section
> + // name to find the relocs.
> + sectionName = sectionName.drop_front(5);
> +
> + auto &ref = RelocationAddendReferences[sectionName];
Ref.
> + for (unsigned int i=0; i<section->getEntityCount(); i++) {
> + ref.push_back(relocs+i);
This is still wrong. The offset to use for each relocation entry is based on
sh_entsize. See the getRel and getEntry templates in Object/ELF.h. We can expose
these, or better yet, proper iterators to them. I dislike duplicating the work
of file format parsing.
> + }
> +
> + }
> +
> + if (section->sh_type == llvm::ELF::SHT_REL)
> + {
No new line before {
> + StringRef contents;
Contents.
> + if ((EC = Obj->getSectionContents(section, contents)))
> + return;
> +
> + llvm::StringRef sectionName;
SectionName.
> + if ((EC = Obj->getSectionName(section, sectionName)))
> + return;
> +
> + Elf_Rel *relocs = const_cast<Elf_Rel *>
> + (reinterpret_cast<const Elf_Rel *>(contents.data()));
Same as above.
> +
> + // Get rid of the leading .rel so Atoms can use their own section
> + // name to find the relocs.
> + sectionName = sectionName.drop_front(4);
> +
> + auto &ref = RelocationReferences[sectionName];
> + for (unsigned int i=0; i<section->getEntityCount(); i++) {
> + ref.push_back(relocs+i);
Same as above.
> + }
> +
> + }
> + }
> +
> +// Increment over all the symbols collecting atoms and symbol
> +// names for later use.
> +
> llvm::object::symbol_iterator it(Obj->begin_symbols());
> llvm::object::symbol_iterator ie(Obj->end_symbols());
>
> for (; it != ie; it.increment(EC)) {
> if (EC)
> return;
> - llvm::object::SectionRef SR;
> - llvm::object::section_iterator section(SR);
>
> - if ((EC = it->getSection(section)))
> + if ((EC = it->getSection(sit)))
> return;
>
> - const Elf_Shdr *Section = Obj->getElfSection(section);
> + const Elf_Shdr *Section = Obj->getElfSection(sit);
> const Elf_Sym *Symbol = Obj->getElfSymbol(it);
>
> llvm::StringRef SymbolName;
> @@ -364,18 +527,24 @@
>
> if (Symbol->st_shndx == llvm::ELF::SHN_ABS) {
> // Create an absolute atom.
> - AbsoluteAtoms._atoms.push_back(
> - new (AtomStorage.Allocate<ELFAbsoluteAtom> ())
> - ELFAbsoluteAtom(*this, SymbolName,
> - Symbol->st_value));
> + auto *NewAtom = new (ReaderStorage.Allocate
> + <ELFAbsoluteAtom<target_endianness, is64Bits> > ())
> + ELFAbsoluteAtom<target_endianness, is64Bits>
> + (*this, SymbolName, Symbol->st_value);
>
> + AbsoluteAtoms._atoms.push_back(NewAtom);
> + SymbolToAtomMapping.insert(std::make_pair(Symbol, NewAtom));
> +
> } else if (Symbol->st_shndx == llvm::ELF::SHN_UNDEF) {
> // Create an undefined atom.
> - UndefinedAtoms._atoms.push_back(
> - new (AtomStorage.Allocate<ELFUndefinedAtom<
> - target_endianness, is64Bits>>())
> - ELFUndefinedAtom<target_endianness, is64Bits> (
> - *this, SymbolName, Symbol));
> + auto *NewAtom = new (ReaderStorage.Allocate
> + <ELFUndefinedAtom<target_endianness, is64Bits> > ())
> + ELFUndefinedAtom<target_endianness, is64Bits>
> + (*this, SymbolName, Symbol);
> +
> + UndefinedAtoms._atoms.push_back(NewAtom);
> + SymbolToAtomMapping.insert(std::make_pair(Symbol, NewAtom));
> +
> } else {
> // This is actually a defined symbol. Add it to its section's list of
> // symbols.
> @@ -427,33 +596,84 @@
>
> bool IsCommon = false;
> if (((*si)->getType() == llvm::ELF::STT_COMMON)
> - || (*si)->st_shndx == llvm::ELF::SHN_COMMON)
> + || (*si)->st_shndx == llvm::ELF::SHN_COMMON)
> IsCommon = true;
>
> // Get the symbol's content:
> llvm::ArrayRef<uint8_t> SymbolData;
> + uint64_t contentSize;
ContentSize.
> if (si + 1 == se) {
> // if this is the last symbol, take up the remaining data.
> - SymbolData = llvm::ArrayRef<uint8_t>((uint8_t *)symbolContents.data()
> - + (*si)->st_value,
> - (IsCommon) ? 0 :
> - ((i.first)->sh_size - (*si)->st_value));
> + contentSize = (IsCommon) ? 0
> + : ((i.first)->sh_size - (*si)->st_value);
> }
> else {
> - SymbolData = llvm::ArrayRef<uint8_t>((uint8_t *)symbolContents.data()
> - + (*si)->st_value,
> - (IsCommon) ? 0 :
> - (*(si + 1))->st_value - (*si)->st_value);
> + contentSize = (IsCommon) ? 0
> + : (*(si + 1))->st_value - (*si)->st_value;
> }
>
> - DefinedAtoms._atoms.push_back(
> - new (AtomStorage.Allocate<ELFDefinedAtom<
> - target_endianness, is64Bits> > ())
> - ELFDefinedAtom<target_endianness, is64Bits> (*this,
> - SymbolName, SectionName,
> - *si, i.first, SymbolData));
> + SymbolData = llvm::ArrayRef<uint8_t>((uint8_t *)symbolContents.data()
> + + (*si)->st_value, contentSize);
> +
> +
> + unsigned int referenceStart = References.size();
ReferencesStart.
> +
> + // Only relocations that are inside the domain of the atom are
> + // added.
> +
> + // Add Rela (those with r_addend) references:
> + for (auto& rai : RelocationAddendReferences[SectionName]) {
RAI.
> + if ((rai->r_offset >= (*si)->st_value) &&
> + (rai->r_offset < (*si)->st_value+contentSize)) {
> +
> + auto *eref = new (ReaderStorage.Allocate
ERef.
> + <ELFReference<target_endianness, is64Bits> > ())
> + ELFReference<target_endianness, is64Bits> (
> + rai, rai->r_offset-(*si)->st_value, nullptr);
> +
> + References.push_back(eref);
> + }
> + }
> +
> + // Add Rel references:
> + for (auto& ri : RelocationReferences[SectionName]) {
auto &RI.
> + if (((ri)->r_offset >= (*si)->st_value) &&
> + ((ri)->r_offset < (*si)->st_value+contentSize)) {
> +
> + auto *eref = new (ReaderStorage.Allocate
ERef.
> + <ELFReference<target_endianness, is64Bits> > ())
> + ELFReference<target_endianness, is64Bits> (
> + (ri), (ri)->r_offset-(*si)->st_value, nullptr);
> +
> + References.push_back(eref);
> + }
> + }
> +
> + // Create the DefinedAtom and add it to the list of DefinedAtoms.
> + auto *NewAtom = new (ReaderStorage.Allocate
> + <ELFDefinedAtom<target_endianness, is64Bits> > ())
> + ELFDefinedAtom<target_endianness, is64Bits>
> + (*this, SymbolName, SectionName,
> + *si, i.first, SymbolData,
> + referenceStart, References.size(), &References);
> +
> + DefinedAtoms._atoms.push_back(NewAtom);
> + SymbolToAtomMapping.insert(std::make_pair((*si), NewAtom));
> +
> }
> }
> +
> +// All the Atoms and References are created. Now update each Reference's
> +// target with the Atom pointer it refers to.
> + typename std::vector<ELFReference<target_endianness, is64Bits> *>::iterator
> + ri = References.begin();
> + typename std::vector<ELFReference<target_endianness, is64Bits> *>::iterator
> + eri = References.end();
> + for (; ri != eri; ri++) {
C++11 for.
> + const Elf_Sym *Symbol = Obj->getElfSymbol((*ri)->targetSymbolIndex());
> + Atom *target = findAtom (Symbol);
Target. Although you don't need a temporary for this. Also findAtom(Symbol).
> + (*ri)->setTarget(target);
> + }
> }
>
> virtual void addAtom(const Atom&) {
> @@ -476,6 +696,15 @@
> return AbsoluteAtoms;
> }
>
> + Atom *findAtom(const Elf_Sym *symbol) {
> + for (auto &ci : SymbolToAtomMapping) {
> + if (ci.first == symbol)
> + return ci.second;
> + }
This should use DenseMap::find.
> + return nullptr;
> + }
> +
> +
> private:
> std::unique_ptr<llvm::object::ELFObjectFile<target_endianness, is64Bits> >
> Obj;
> @@ -483,8 +712,18 @@
> atom_collection_vector<UndefinedAtom> UndefinedAtoms;
> atom_collection_vector<SharedLibraryAtom> SharedLibraryAtoms;
> atom_collection_vector<AbsoluteAtom> AbsoluteAtoms;
> - llvm::BumpPtrAllocator AtomStorage;
>
> +// This contains a list of relocations references. In ELF if a
> +// section named, ".text" that has relocations will also have
> +// a section named ".rel.text" or ".rela.text" which will hold the
> +// entries. -- .rel or .rela is prepended to create the SHT_REL(A) section.
Doxygen comments.
> + std::map<llvm::StringRef, std::vector<Elf_Rela *>> RelocationAddendReferences;
> + std::map<llvm::StringRef, std::vector<Elf_Rel *>> RelocationReferences;
> +
> + std::vector<ELFReference<target_endianness, is64Bits> *>References;
> + llvm::DenseMap<const Elf_Sym *, Atom *>SymbolToAtomMapping;
Space after the >.
> +
> + llvm::BumpPtrAllocator ReaderStorage;
> };
>
> // ReaderELF is reader object that will instantiate correct FileELF
>
Other patch looks fine.
- Michael Spencer
More information about the llvm-commits
mailing list