[llvm-commits] RuntimeDyld RelocationResolver

Jim Grosbach grosbach at apple.com
Thu Sep 29 11:25:49 PDT 2011


On Sep 23, 2011, at 8:13 AM, Danil Malyshev wrote:

> Hello everyone,
>  
>  
> Please find attached the patch for review.
> The patch adds a new RelocationResolver class which will get used later in the re-factored RuntimeDyLd.
> I’ll send the patch for re-factored RuntimeDyLd once this one will pass the review.

Hi Danil,

Some comments:

> Index: lib/ExecutionEngine/RuntimeDyld/RelocationResolver.cpp
> ===================================================================
> --- lib/ExecutionEngine/RuntimeDyld/RelocationResolver.cpp	(revision 0)
> +++ lib/ExecutionEngine/RuntimeDyld/RelocationResolver.cpp	(revision 0)
> @@ -0,0 +1,198 @@
> +#include "RelocationResolver.h"
> +#include "llvm/Support/ErrorHandling.h"
> +
> +using namespace llvm;
> +
> +namespace llvm {
> +
> +RelocationResolver::~RelocationResolver() {}
> +RelocationResolverARM::~RelocationResolverARM() {}
> +RelocationResolverX86::~RelocationResolverX86() {}
> +

Target names for classes like this are generally done as a prefix, not a suffix. It would be better to do that here for consistency.


> +RelocationResolver* RelocationResolver::get(MachineType Type) {
> +  switch (Type) {
> +  default:
> +    report_fatal_error("Unsupported machine type!");
> +  case MT_ARM :
> +    return new RelocationResolverARM();
> +  case MT_X86 :
> +    return new RelocationResolverX86();
> +  }

x86 vs. x86_64?

> +}
> +
> +unsigned RelocationResolver::getMaxStubSize() {
> +    return 0;
> +}

This seems odd? Placeholder? If so, a FIXME to that effect would be good.

> +
> +// ------------------------------------------------------ RelocationResolverARM

LLVM doesn't generally use right-justified comments like this. Not a big deal, but it looks a bit odd in context.

Eventually, the target-specific bits will all be moved into the actual Target directories. That's OK to leave for another day, though. For now, though, it'd be good to split the target bits into separate files rather than keeping it all together in one. That'll make it easier to move things out when the time comes. A bit outside the scope of this patch since there's pre-existing code there that isn't split out that way, but it would be a lot better to do now rather than later.

> +void RelocationResolverARM::
> +resolveRelocation(uint8_t *Address, uintptr_t Value,
> +                  RelocationType Type, intptr_t Addend) {
> +  // TODO: Add Thumb relocations.
> +  uint32_t* RelPtr = (uint32_t*)Address;
> +

Be careful here. The MCJIT supports compiling for a remote target, which may not be the same arch, so assumptions about the size of pointers and such may not work out. At first reading, there's nothing obviously incorrect in that regard here, but it's tricky, so worth mentioning to keep an eye out for it.


> +  switch(Type) {
> +  default:
> +    report_fatal_error("Not implemented relocation type!");
> +
> +  case reloc_arm_absolute :
> +    *RelPtr = Value;
> +    break;
> +
> +  case reloc_arm_movw :
> +    Value = Value & 0xFFFF;
> +    *RelPtr |= Value & 0xFFF;
> +    *RelPtr |= ((Value >> 12) & 0xF) << 16;
> +    break;
> +
> +  case reloc_arm_movt :
> +    Value = (Value >> 16) & 0xFFFF;
> +    *RelPtr |= Value & 0xFFF;
> +    *RelPtr |= ((Value >> 12) & 0xF) << 16;
> +    break;
> +
> +  case reloc_arm_branch :
> +    int32_t RelValue = (int32_t)Value - (int32_t)Address - 8;
> +    RelValue = (RelValue & 0x03FFFFFC) >> 2;
> +    *RelPtr &= 0xFF000000;
> +    *RelPtr |= RelValue;
> +    break;
> +  }
> +}
> +
> +uint8_t* RelocationResolverARM::createStubFunction(uint8_t *Addr) {
> +  // TODO: Add Thumb stub.
> +  uint32_t *stubAddr = (uint32_t*)Addr;
> +  *stubAddr = ARMStubInstructionCode;
> +  return (uint8_t*)++stubAddr;
> +}

This seems odd. At minimum, copious comments are needed explaining what's going on here.

> +
> +void RelocationResolverARM::
> +processRelocations(const RawRelocationList &RawRelocations,
> +                   const NamedAddresses &Symbols,
> +                   uint8_t *StubBuffer, RelocationMap &Relocations) {
> +
> +  StringMap<uint8_t*> stubs;
> +
> +  RawRelocationList::const_iterator relIt = RawRelocations.begin();
> +  RawRelocationList::const_iterator relItEnd = RawRelocations.end();
> +  for ( ;relIt != relItEnd; ++relIt) {
> +    // Get modify address.
> +    NamedAddresses::const_iterator symIt = Symbols.find(relIt->SymbolName);

We shouldn't need to do by-name lookup here. Mapping names should have been handled by the loader when the symbol table was processed. Everything at this level should be able to use symbol table indices.

> +    if (symIt == Symbols.end()) {
> +      report_fatal_error("Symbol for relocation has not been found.");
> +    }

No braces when there's just one statement.

> +    uint8_t *addr = symIt->getValue() + relIt->SymbolOffset;
> +
> +    // Try to find target address.
> +    symIt = Symbols.find(relIt->TargetName);
> +    if (symIt != Symbols.end()) {
> +      // Target symbol found, so just resolve immediate.
> +      resolveRelocation(addr, (uintptr_t)symIt->getValue(),
> +                        relIt->Type, relIt->Addend);
> +    } else {
> +      // This is an external symbol. We donít have the address yet.
> +      // Add it to the RelocationTable for later processing.
> +      if (relIt->Type == reloc_arm_branch) {
> +        // This is a branch relocation, need to use a stub function.
> +
> +        //  Look up for existing stub.
> +        StringMap<uint8_t*>::iterator stubIt = stubs.find(relIt->TargetName);
> +        if (stubIt != stubs.end()) {
> +          resolveRelocation(addr, (uintptr_t)stubIt->getValue(),
> +                            relIt->Type, relIt->Addend);
> +        } else {
> +          // Create a new stub function.

Not all relocations are on functions. How are global values handled, for example?

> +          resolveRelocation(addr, (uintptr_t)StubBuffer,
> +                            relIt->Type, relIt->Addend);
> +          stubs[relIt->TargetName] = StubBuffer;
> +          uint8_t *stubAddr = createStubFunction(StubBuffer);
> +          Relocations[relIt->TargetName].push_back(
> +            RelocationEntry(stubAddr,
> +                            reloc_arm_absolute,
> +                            0));
> +          StubBuffer += getMaxStubSize();
> +        }
> +      } else {
> +        Relocations[relIt->TargetName].push_back(
> +          RelocationEntry(addr,
> +                          relIt->Type,
> +                          relIt->Addend));
> +      }
> +    }
> +  }
> +}
> +
> +unsigned RelocationResolverARM::getMaxStubSize() {
> +  return sizeof(void*)*2;
> +}
> +
> +
> +// ------------------------------------------------------ RelocationResolverX86
> +void RelocationResolverX86::
> +resolveRelocation(uint8_t *Address, uintptr_t Value,
> +                  RelocationType Type, intptr_t Addend) {
> +  uint32_t* relPtr32 = (uint32_t*)Address;
> +  uint64_t* relPtr64 = (uint64_t*)Address;
> +  intptr_t relValue = (intptr_t)Value + (intptr_t)Address + Addend;
> +
> +  switch(Type) {
> +  default:
> +    report_fatal_error("Not implemented relocation type!");
> +
> +  case reloc_x86_absolute_32 :
> +    *relPtr32 = (uint32_t)Value + (uint32_t)Addend;
> +    break;
> +
> +  case reloc_x86_absolute_64 :
> +    *relPtr64 = (uint64_t)Value + (uint64_t)Addend;
> +    break;
> +
> +  case reloc_x86_branch_32 :
> +    *relPtr32 = (int32_t)relValue;
> +    break;
> +
> +  case reloc_x86_branch_64 :
> +    *relPtr64 = (int64_t)relValue;
> +    break;
> +  }
> +}
> +
> +void RelocationResolverX86::
> +processRelocations(const RawRelocationList &RawRelocations,
> +                   const NamedAddresses &Symbols,
> +                   uint8_t *StubBuffer, RelocationMap &Relocations) {
> +
> +  RawRelocationList::const_iterator relIt = RawRelocations.begin();
> +  RawRelocationList::const_iterator relItEnd = RawRelocations.end();
> +  for (;relIt != relItEnd; ++relIt) {
> +    // Get modify address.
> +    NamedAddresses::const_iterator symIt = Symbols.find(relIt->SymbolName);
> +    if (symIt == Symbols.end()) {
> +      report_fatal_error("Symbol for relocation has not been found.");
> +    }
> +    uint8_t *addr = symIt->getValue() + relIt->SymbolOffset;
> +
> +    // Try to find target address.
> +    symIt = Symbols.find(relIt->TargetName);
> +    if (symIt != Symbols.end()) {
> +      // Target symbol found, so just resolve immediate.
> +      resolveRelocation(addr, (uintptr_t)symIt->getValue(),
> +                        relIt->Type, relIt->Addend);
> +    } else {
> +      // This is an external symbol. We donít have the address yet.
> +      // Add it to the RelocationTable for later processing.
> +      Relocations[relIt->TargetName].push_back(
> +        RelocationEntry(addr,
> +                        relIt->Type,
> +                        relIt->Addend));
> +    }
> +  }
> +}
> +
> +unsigned RelocationResolverX86::getMaxStubSize() {
> +  return 0;
> +}
> +
> +
> +} // end namespace llvm
> Index: lib/ExecutionEngine/RuntimeDyld/RelocationResolver.h
> ===================================================================
> --- lib/ExecutionEngine/RuntimeDyld/RelocationResolver.h	(revision 0)
> +++ lib/ExecutionEngine/RuntimeDyld/RelocationResolver.h	(revision 0)
> @@ -0,0 +1,94 @@
> +//===-- RelocationResolver.h - Run-time dynamic linker for MC-JIT ------*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Machine specific relocation resolver.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +
> +#ifndef LLVM_RUNTIME_DYLD_RELOCATION_RESOLVER_H
> +#define LLVM_RUNTIME_DYLD_RELOCATION_RESOLVER_H
> +
> +#include "RuntimeDyldTypes.h"
> +
> +namespace llvm {
> +
> +class RelocationResolver {
> +public:
> +
> +  virtual ~RelocationResolver();
> +
> +  /// \brief Creates a new relocation resolver for the selected machine type.
> +  static RelocationResolver* get(MachineType Type);
> +
> +  /// \brief A machine specific relocation resolver
> +  /// \param Address Address to apply the relocation action
> +  /// \param Value Target symbol address to apply the relocation action
> +  /// \param Type Machine specific relocation type
> +  /// \param Addend A constant addend used to compute the value to be stored
> +  ///        into the relocatable field
> +  virtual void resolveRelocation(uint8_t *Address, uintptr_t Value,
> +                                 RelocationType Type,
> +                                 intptr_t Addend) = 0;
> +
> +  /// \brief Returns a maximum size of a function stub in bytes, or 0 for
> +  ///        platforms without stubs.
> +  virtual unsigned getMaxStubSize();
> +
> +  /// \brief Processes the relocations. Each relocation gets checked if it could
> +  ///        be resolved, gets resolved if so, or gets added to the Relocations
> +  ///        list for later processing.
> +  /// \param RawRelocations List with raw information about relocations
> +  /// \param NamedAddresses Addresses of named symbols from current object file.
> +  /// \param StubBuffer Pointer to buffer for emmitting stub functions, for
> +  ///        selected relocation, if necessory.
> +  /// \param Relocations Result map with relocations whose target symbol
> +  ///        addresses unknown.
> +  virtual void processRelocations(const RawRelocationList &RawRelocations,
> +                                  const NamedAddresses &Symbols,
> +                                  uint8_t *StubBuffer,
> +                                  RelocationMap &Relocations) = 0;
> +};
> +
> +class RelocationResolverARM: public RelocationResolver {
> +private:
> +  static const uint32_t ARMStubInstructionCode = 0xe51ff004; // ldr pc,<label>
> +public:
> +  ~RelocationResolverARM();
> +  void resolveRelocation(uint8_t *Address, uintptr_t Value,
> +                         RelocationType Type, intptr_t Addend);
> +
> +  /// \brief Emits long jump instruction to Addr.
> +  /// \return Pointer to the memory area for emitting target address.
> +  uint8_t* createStubFunction(uint8_t *Addr);
> +
> +  unsigned getMaxStubSize();
> +
> +  void processRelocations(const RawRelocationList &RawRelocations,
> +                          const NamedAddresses &Symbols, uint8_t *StubBuffer,
> +                          RelocationMap &Relocations);
> +};
> +
> +class RelocationResolverX86: public RelocationResolver {
> +public:
> +  ~RelocationResolverX86();
> +  void resolveRelocation(uint8_t *Address, uintptr_t Value,
> +                         RelocationType Type, intptr_t Addend);
> +
> +  unsigned getMaxStubSize();
> +
> +  void processRelocations(const RawRelocationList &RawRelocations,
> +                          const NamedAddresses &Symbols, uint8_t *StubBuffer,
> +                          RelocationMap &Relocations);
> +};
> +
> +} // end namespace llvm
> +
> +
> +#endif
> Index: lib/ExecutionEngine/RuntimeDyld/RuntimeDyldTypes.h
> ===================================================================
> --- lib/ExecutionEngine/RuntimeDyld/RuntimeDyldTypes.h	(revision 0)
> +++ lib/ExecutionEngine/RuntimeDyld/RuntimeDyldTypes.h	(revision 0)
> @@ -0,0 +1,95 @@
> +//===-- RuntimeDyldTypes.h - Run-time dynamic linker for MC-JIT ------*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Types used by runtime dynamic linker facilities.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef LLVM_RUNTIME_DYLD_TYPES_H
> +#define LLVM_RUNTIME_DYLD_TYPES_H
> +
> +#include "llvm/ADT/StringMap.h"
> +#include "llvm/ADT/SmallVector.h"
> +#include <set>
> +
> +using namespace llvm;
> +
> +namespace llvm {
> +
> +enum MachineType {
> +  MT_ARM,
> +  MT_X86
> +};
> +
> +
> +struct SymbolEntry {
> +  const char* SymbolName;      // Symbol name

As above, the name shouldn't matter here. We should have a master symbol table and use indices into that as object identifiers.

The RuntimeDyld already has a symbol table. Why is this bit not using/enhancing that?

> +  bool IsGlobal;               // If true, this is a global symbol which could
> +                               // be used in other object files.
> +  const uint8_t *ObjStartAddr; // Begin of memory block with function instructions
> +  const uint8_t *ObjEndAddr;   // End of memory block with function instructions

Is this target memory? The local memory where our copy is stored? In any case, addr/length is probably better than start/end so we can handle symbols w/o any data associated (aliases).

> +
> +  SymbolEntry(const char *name, bool isGlobal,
> +              const uint8_t *startAddr, const uint8_t *endAddr)
> +    : SymbolName(name), IsGlobal(isGlobal), ObjStartAddr(startAddr),
> +      ObjEndAddr(endAddr) {}
> +};
> +typedef SmallVector<SymbolEntry, 4> SymbolList;
> +
> +typedef std::set<const char*> ExtSymbols;
> +typedef StringMap<uint8_t*> NamedAddresses;
> +
> +enum RelocationType {
> +  reloc_none,
> +  reloc_arm_absolute,
> +  reloc_arm_branch,
> +  reloc_arm_movt,
> +  reloc_arm_movw,
> +
> +  reloc_x86_absolute_32,
> +  reloc_x86_absolute_64,
> +  reloc_x86_branch_32,
> +  reloc_x86_branch_64
> +};

Relocations are inherently target and platform specific. ARM MachO vs. ARM ELF are very different, for example. This appears to be trying for a one-size-fits-all solution. I don't think that's going to work out. This is the biggest concern I have with this patch. The relocations should use the definitions in the Object and ELF/MachO file format headers and handle them distinctly.

> +
> +struct RelocationEntry {
> +  uint8_t*    Address;
> +  RelocationType Type;
> +  intptr_t    Addend;
> +
> +  RelocationEntry(uint8_t* address,
> +                  RelocationType relType,
> +                  intptr_t addend)
> +    : Address(address), Type(relType), Addend(addend) {}
> +};
> +typedef SmallVector<RelocationEntry, 4> RelocationList;
> +typedef StringMap<RelocationList> RelocationMap;
> +
> +struct RawRelocationEntry {
> +  const char* TargetName;
> +  const char* SymbolName;
> +  unsigned    SymbolOffset;
> +  RelocationType Type;
> +  intptr_t    Addend;
> +
> +  RawRelocationEntry(const char* targetName,
> +                     const char* symbolName,
> +                     unsigned symbolOffset,
> +                     RelocationType relType,
> +                     intptr_t addend)
> +    : TargetName(targetName), SymbolName(symbolName),
> +      SymbolOffset(symbolOffset), Type(relType), Addend(addend) {}
> +};
> +typedef SmallVector<RawRelocationEntry, 4> RawRelocationList;
> +
> +
> +} // end namespace llvm
> +
> +
> +#endif
> 




More information about the llvm-commits mailing list