[LLVMdev] [PATCH] basic reading reloc visitor for x86_64 ELF

Eric Christopher echristo at gmail.com
Tue Nov 6 08:38:49 PST 2012


That's what I get for moving the patch between machines. Attached here
along with one of the comments that Eli wanted.

-eric


On Tue, Nov 6, 2012 at 12:00 AM, Michael Spencer <bigcheesegs at gmail.com> wrote:

> On Mon, Nov 5, 2012 at 5:17 PM, Eric Christopher <echristo at gmail.com>
> wrote:
> > For llvm-dwarfdump we need to handle relocations inside the debug info
> > sections in order to successfully dump the dwarf info including strings.
> > Nick sent out a partial patch that did this not too long ago and I've taken
> > it and gone in a bit of a different direction, but kept the same basic
> > architecture.
> >
> > In place of applying the relocations to the data we've read from disk I'm
> > keeping a separate mapping table to the side and checking it at the
> > locations in the dwarf where I'm expecting relocated values. This adds a
> > bit of complexity to the dwarf parsing/extraction, with the benefit of not
> > allocating memory for the entire size of the debug info section.
> >
> > Couple of areas that will need to be improved later:
> >
> > a) Relocations in more than a single section: the .debug_info section is the
> > primary one I cared about first, however, we'll need either
> >   1) A better mapping that contains section + address (since the debug
> > sections are mapped at address 0 I can't just use total offset)
> >   2) More mappings per section we're disassembling
> >
> > I'm likely to go with #2 rather than #1, but I'm open to any rationale
> > for either direction.
> >
> > b) Symbol relocations for function sections and/or functions as well as
> > hooking it into, e.g. the aranges disassembly.
> >
> > I've got plans to add these things as I go along, but since it was now
> > pretty usable for testing/dumping I wanted to get it in and then work
> > incrementally on top of it.
> >
> > Thoughts?
> >
> > -eric
> >
>
> This seems to be missing RelocVisitor.h
>
> - Michael Spencer
>
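
For readers following along, here is a minimal sketch of the side-table idea
described in the quoted message above, written against the standard library
only so that it builds outside the LLVM tree. The names RelocTable and
readMaybeRelocated are hypothetical and do not appear in the patch; the table
plays the role of the patch's RelocAddrMap (offset -> (width, value)), and the
reader assumes little-endian data and a width of at most 8 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the patch's RelocAddrMap:
// section offset -> (width in bytes, relocated value).
using RelocTable = std::map<uint64_t, std::pair<uint8_t, int64_t>>;

// Read an unsigned little-endian value of `width` bytes at *offset. If the
// side table has an entry for this offset, return the relocated value and
// skip the recorded width instead of decoding the raw bytes.
uint64_t readMaybeRelocated(const std::vector<uint8_t> &data,
                            const RelocTable &relocs, uint64_t *offset,
                            uint8_t width) {
  auto it = relocs.find(*offset);
  if (it != relocs.end()) {
    *offset += it->second.first;
    return static_cast<uint64_t>(it->second.second);
  }
  // Assumption of this sketch: the caller passes an in-bounds read.
  assert(width <= 8 && *offset + width <= data.size());
  uint64_t value = 0;
  for (uint8_t i = 0; i < width; ++i)
    value |= static_cast<uint64_t>(data[*offset + i]) << (8 * i);
  *offset += width;
  return value;
}
```

The section bytes are never modified, so no copy of the section is needed.
Supporting more than one section (option 2 in the quoted message) would, under
the same assumptions, amount to keeping one such table per section.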
-------------- next part --------------
commit a35a9c732fd8838523561698bf63b9c17758983e
Author: Eric Christopher <echristo at gmail.com>
Date:   Tue Nov 6 08:37:24 2012 -0800

    First stab at relocation visitor for object.

diff --git a/include/llvm/DebugInfo/DIContext.h b/include/llvm/DebugInfo/DIContext.h
index 8d6054a..2e34bac 100644
--- a/include/llvm/DebugInfo/DIContext.h
+++ b/include/llvm/DebugInfo/DIContext.h
@@ -15,6 +15,7 @@
 #ifndef LLVM_DEBUGINFO_DICONTEXT_H
 #define LLVM_DEBUGINFO_DICONTEXT_H
 
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/StringRef.h"
@@ -89,6 +90,13 @@ public:
   }
 };
 
+// In place of applying the relocations to the data we've read from disk we use
+// a separate mapping table to the side and check it at locations in the dwarf
+// where we expect relocated values. This adds a bit of complexity to the dwarf
+// parsing/extraction, with the benefit of not allocating memory for the entire
+// size of the debug info sections.
+typedef DenseMap<uint64_t, std::pair<uint8_t, int64_t> > RelocAddrMap;
+
 class DIContext {
 public:
   virtual ~DIContext();
@@ -100,7 +108,8 @@ public:
                                     StringRef aRangeSection = StringRef(),
                                     StringRef lineSection = StringRef(),
                                     StringRef stringSection = StringRef(),
-                                    StringRef rangeSection = StringRef());
+                                    StringRef rangeSection = StringRef(),
+                                    const RelocAddrMap &Map = RelocAddrMap());
 
   virtual void dump(raw_ostream &OS) = 0;
 
diff --git a/include/llvm/Object/RelocVisitor.h b/include/llvm/Object/RelocVisitor.h
new file mode 100644
index 0000000..7668bde
--- /dev/null
+++ b/include/llvm/Object/RelocVisitor.h
@@ -0,0 +1,131 @@
+//===-- RelocVisitor.h - Visitor for object file relocations -*- C++ -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file provides a wrapper around all the different types of relocations
+// in different file formats, such that a client can handle them in a unified
+// manner by only implementing a minimal number of functions.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_OBJECT_RELOCVISITOR_H
+#define LLVM_OBJECT_RELOCVISITOR_H
+
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Object/ObjectFile.h"
+#include "llvm/Object/ELF.h"
+#include "llvm/ADT/StringRef.h"
+
+namespace llvm {
+namespace object {
+
+struct RelocToApply {
+  // The computed value after applying the relevant relocations.
+  int64_t Value;
+
+  // The width of the value; how many bytes to touch when applying the
+  // relocation.
+  char Width;
+  RelocToApply(const RelocToApply &In) : Value(In.Value), Width(In.Width) {}
+  RelocToApply(int64_t Value, char Width) : Value(Value), Width(Width) {}
+  RelocToApply() : Value(0), Width(0) {}
+};
+
+/// @brief Base class for object file relocation visitors.
+class RelocVisitor {
+public:
+  explicit RelocVisitor(llvm::StringRef FileFormat)
+    : FileFormat(FileFormat), HasError(false) {}
+
+  // TODO: Should handle multiple applied relocations via either passing in the
+  // previously computed value or just count paired relocations as a single
+  // visit.
+  RelocToApply visit(uint32_t RelocType, RelocationRef R, uint64_t SecAddr = 0,
+                     uint64_t Value = 0) {
+    if (FileFormat == "ELF64-x86-64") {
+      switch (RelocType) {
+        case llvm::ELF::R_X86_64_NONE:
+          return visitELF_X86_64_NONE(R);
+        case llvm::ELF::R_X86_64_64:
+          return visitELF_X86_64_64(R, Value);
+        case llvm::ELF::R_X86_64_PC32:
+          return visitELF_X86_64_PC32(R, Value, SecAddr);
+        case llvm::ELF::R_X86_64_32:
+          return visitELF_X86_64_32(R, Value);
+        case llvm::ELF::R_X86_64_32S:
+          return visitELF_X86_64_32S(R, Value);
+        default:
+          HasError = true;
+          return RelocToApply();
+      }
+    }
+    return RelocToApply();
+  }
+
+  bool error() { return HasError; }
+
+private:
+  llvm::StringRef FileFormat;
+  bool HasError;
+
+  /// Operations
+
+  // Width is the width in bytes of the extend.
+  RelocToApply zeroExtend(RelocToApply r, char Width) {
+    if (Width == r.Width || Width >= 8)
+      return r;
+    r.Value &= (1ULL << (Width * 8)) - 1;
+    return r;
+  }
+  RelocToApply signExtend(RelocToApply r, char Width) {
+    if (Width == r.Width || Width >= 8)
+      return r;
+    bool SignBit = r.Value & (1ULL << ((Width * 8) - 1));
+    if (SignBit) {
+      r.Value |= ~((1ULL << (Width * 8)) - 1);
+    } else {
+      r.Value &= (1ULL << (Width * 8)) - 1;
+    }
+    return r;
+  }
+
+  /// X86-64 ELF
+  RelocToApply visitELF_X86_64_NONE(RelocationRef R) {
+    return RelocToApply(0, 0);
+  }
+  RelocToApply visitELF_X86_64_64(RelocationRef R, uint64_t Value) {
+    int64_t Addend;
+    R.getAdditionalInfo(Addend);
+    return RelocToApply(Value + Addend, 8);
+  }
+  RelocToApply visitELF_X86_64_PC32(RelocationRef R, uint64_t Value,
+                                    uint64_t SecAddr) {
+    int64_t Addend;
+    R.getAdditionalInfo(Addend);
+    uint64_t Address;
+    R.getAddress(Address);
+    return RelocToApply(Value + Addend - Address, 4);
+  }
+  RelocToApply visitELF_X86_64_32(RelocationRef R, uint64_t Value) {
+    int64_t Addend;
+    R.getAdditionalInfo(Addend);
+    uint32_t Res = (Value + Addend) & 0xFFFFFFFF;
+    return RelocToApply(Res, 4);
+  }
+  RelocToApply visitELF_X86_64_32S(RelocationRef R, uint64_t Value) {
+    int64_t Addend;
+    R.getAdditionalInfo(Addend);
+    int32_t Res = (Value + Addend) & 0xFFFFFFFF;
+    return RelocToApply(Res, 4);
+  }
+};
+
+}
+}
+#endif
diff --git a/include/llvm/Support/DataExtractor.h b/include/llvm/Support/DataExtractor.h
index 8d880fd..a3ae782 100644
--- a/include/llvm/Support/DataExtractor.h
+++ b/include/llvm/Support/DataExtractor.h
@@ -10,6 +10,7 @@
 #ifndef LLVM_SUPPORT_DATAEXTRACTOR_H
 #define LLVM_SUPPORT_DATAEXTRACTOR_H
 
+#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Support/DataTypes.h"
 
diff --git a/lib/DebugInfo/DIContext.cpp b/lib/DebugInfo/DIContext.cpp
index ead57f9..691a92c 100644
--- a/lib/DebugInfo/DIContext.cpp
+++ b/lib/DebugInfo/DIContext.cpp
@@ -19,8 +19,9 @@ DIContext *DIContext::getDWARFContext(bool isLittleEndian,
                                       StringRef aRangeSection,
                                       StringRef lineSection,
                                       StringRef stringSection,
-                                      StringRef rangeSection) {
+                                      StringRef rangeSection,
+                                      const RelocAddrMap &Map) {
   return new DWARFContextInMemory(isLittleEndian, infoSection, abbrevSection,
                                   aRangeSection, lineSection, stringSection,
-                                  rangeSection);
+                                  rangeSection, Map);
 }
diff --git a/lib/DebugInfo/DWARFContext.h b/lib/DebugInfo/DWARFContext.h
index d10e850..4001792 100644
--- a/lib/DebugInfo/DWARFContext.h
+++ b/lib/DebugInfo/DWARFContext.h
@@ -26,6 +26,7 @@ namespace llvm {
 /// methods that a concrete implementation provides.
 class DWARFContext : public DIContext {
   bool IsLittleEndian;
+  const RelocAddrMap &RelocMap;
 
   SmallVector<DWARFCompileUnit, 1> CUs;
   OwningPtr<DWARFDebugAbbrev> Abbrev;
@@ -38,9 +39,11 @@ class DWARFContext : public DIContext {
   /// Read compile units from the debug_info section and store them in CUs.
   void parseCompileUnits();
 protected:
-  DWARFContext(bool isLittleEndian) : IsLittleEndian(isLittleEndian) {}
+  DWARFContext(bool isLittleEndian, const RelocAddrMap &Map) :
+    IsLittleEndian(isLittleEndian), RelocMap(Map) {}
 public:
   virtual void dump(raw_ostream &OS);
+
   /// Get the number of compile units in this context.
   unsigned getNumCompileUnits() {
     if (CUs.empty())
@@ -70,6 +73,7 @@ public:
       DILineInfoSpecifier Specifier = DILineInfoSpecifier());
 
   bool isLittleEndian() const { return IsLittleEndian; }
+  const RelocAddrMap &relocMap() const { return RelocMap; }
 
   virtual StringRef getInfoSection() = 0;
   virtual StringRef getAbbrevSection() = 0;
@@ -108,8 +112,9 @@ public:
                        StringRef aRangeSection,
                        StringRef lineSection,
                        StringRef stringSection,
-                       StringRef rangeSection)
-    : DWARFContext(isLittleEndian),
+                       StringRef rangeSection,
+                       const RelocAddrMap &Map = RelocAddrMap())
+    : DWARFContext(isLittleEndian, Map),
       InfoSection(infoSection),
       AbbrevSection(abbrevSection),
       ARangeSection(aRangeSection),
diff --git a/lib/DebugInfo/DWARFFormValue.cpp b/lib/DebugInfo/DWARFFormValue.cpp
index c9ecbbb..fea9fd7 100644
--- a/lib/DebugInfo/DWARFFormValue.cpp
+++ b/lib/DebugInfo/DWARFFormValue.cpp
@@ -10,6 +10,7 @@
 #include "DWARFFormValue.h"
 #include "DWARFCompileUnit.h"
 #include "DWARFContext.h"
+#include "llvm/Support/Debug.h"
 #include "llvm/Support/Dwarf.h"
 #include "llvm/Support/Format.h"
 #include "llvm/Support/raw_ostream.h"
@@ -98,8 +99,16 @@ DWARFFormValue::extractValue(DataExtractor data, uint32_t *offset_ptr,
     indirect = false;
     switch (Form) {
     case DW_FORM_addr:
-    case DW_FORM_ref_addr:
-      Value.uval = data.getUnsigned(offset_ptr, cu->getAddressByteSize());
+    case DW_FORM_ref_addr: {
+      RelocAddrMap::const_iterator AI
+        = cu->getContext().relocMap().find(*offset_ptr);
+      if (AI != cu->getContext().relocMap().end()) {
+        const std::pair<uint8_t, int64_t> &R = AI->second;
+        Value.uval = R.second;
+        *offset_ptr += R.first;
+      } else
+        Value.uval = data.getUnsigned(offset_ptr, cu->getAddressByteSize());
+    }
       break;
     case DW_FORM_exprloc:
     case DW_FORM_block:
@@ -138,9 +147,17 @@ DWARFFormValue::extractValue(DataExtractor data, uint32_t *offset_ptr,
     case DW_FORM_sdata:
       Value.sval = data.getSLEB128(offset_ptr);
       break;
-    case DW_FORM_strp:
-      Value.uval = data.getU32(offset_ptr);
+    case DW_FORM_strp: {
+      RelocAddrMap::const_iterator AI
+        = cu->getContext().relocMap().find(*offset_ptr);
+      if (AI != cu->getContext().relocMap().end()) {
+        const std::pair<uint8_t, int64_t> &R = AI->second;
+        Value.uval = R.second;
+        *offset_ptr += R.first;
+      } else
+        Value.uval = data.getU32(offset_ptr);
       break;
+    }
     case DW_FORM_udata:
     case DW_FORM_ref_udata:
       Value.uval = data.getULEB128(offset_ptr);
diff --git a/tools/llvm-dwarfdump/llvm-dwarfdump.cpp b/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
index 309bc4e..e73300a 100644
--- a/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
+++ b/tools/llvm-dwarfdump/llvm-dwarfdump.cpp
@@ -15,6 +15,7 @@
 #include "llvm/ADT/Triple.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Object/ObjectFile.h"
+#include "llvm/Object/RelocVisitor.h"
 #include "llvm/DebugInfo/DIContext.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
@@ -28,6 +29,9 @@
 #include "llvm/Support/system_error.h"
 #include <algorithm>
 #include <cstring>
+#include <list>
+#include <string>
+
 using namespace llvm;
 using namespace object;
 
@@ -67,6 +71,7 @@ static void DumpInput(const StringRef &Filename) {
   OwningPtr<ObjectFile> Obj(ObjectFile::createObjectFile(Buff.take()));
 
   StringRef DebugInfoSection;
+  RelocAddrMap RelocMap;
   StringRef DebugAbbrevSection;
   StringRef DebugLineSection;
   StringRef DebugArangesSection;
@@ -97,6 +102,57 @@ static void DumpInput(const StringRef &Filename) {
       DebugStringSection = data;
     else if (name == "debug_ranges")
       DebugRangesSection = data;
+    // Any more debug info sections go here.
+    else
+      continue;
+
+    // TODO: For now only handle relocations for the debug_info section.
+    if (name != "debug_info")
+      continue;
+
+    if (i->begin_relocations() != i->end_relocations()) {
+      uint64_t SectionSize;
+      i->getSize(SectionSize);
+      for (relocation_iterator reloc_i = i->begin_relocations(),
+                               reloc_e = i->end_relocations();
+                               reloc_i != reloc_e; reloc_i.increment(ec)) {
+        uint64_t Address;
+        reloc_i->getAddress(Address);
+        uint64_t Type;
+        reloc_i->getType(Type);
+
+        RelocVisitor V(Obj->getFileFormatName());
+        // The section address is always 0 for debug sections.
+        RelocToApply R(V.visit(Type, *reloc_i));
+        if (V.error()) {
+          SmallString<32> Name;
+          error_code ec(reloc_i->getTypeName(Name));
+          if (ec) {
+            errs() << "Aaaaaa! Nameless relocation! Aaaaaa!\n";
+          }
+          errs() << "error: failed to compute relocation: "
+                 << Name << "\n";
+          continue;
+        }
+
+        if (Address + R.Width > SectionSize) {
+          errs() << "error: " << int(R.Width) << "-byte relocation starting "
+                 << Address << " bytes into section " << name << " which is "
+                 << SectionSize << " bytes long.\n";
+          continue;
+        }
+        if (R.Width > 8) {
+          errs() << "error: can't handle a relocation of more than 8 bytes at "
+                    "a time.\n";
+          continue;
+        }
+        DEBUG(dbgs() << "Writing " << format("%p", R.Value)
+                     << " at " << format("%p", Address)
+                     << " with width " << format("%d", R.Width)
+                     << "\n");
+        RelocMap[Address] = std::make_pair(R.Width, R.Value);
+      }
+    }
   }
 
   OwningPtr<DIContext> dictx(DIContext::getDWARFContext(/*FIXME*/true,
@@ -105,7 +161,8 @@ static void DumpInput(const StringRef &Filename) {
                                                         DebugArangesSection,
                                                         DebugLineSection,
                                                         DebugStringSection,
-                                                        DebugRangesSection));
+                                                        DebugRangesSection,
+                                                        RelocMap));
   if (Address == -1ULL) {
     outs() << Filename
            << ":\tfile format " << Obj->getFileFormatName() << "\n\n";
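
As a companion to the visitELF_X86_64_* functions in the patch, the following
standalone sketch spells out the arithmetic for the four relocation types it
handles, using the usual ELF notation: S is the symbol value, A the addend, and
P the address of the field being relocated. RelocKind, Applied and computeReloc
are hypothetical names, not part of the patch or of LLVM. Unlike the patch,
this sketch also truncates the PC-relative result to 32 bits; that is an
assumption about the intended behaviour, not something the patch does.

```cpp
#include <cstdint>

// Hypothetical, simplified kinds mirroring R_X86_64_64, R_X86_64_PC32,
// R_X86_64_32 and R_X86_64_32S.
enum class RelocKind { Abs64, PC32, Abs32, Abs32S };

struct Applied {
  int64_t value;  // computed value
  uint8_t width;  // number of bytes the relocation covers
};

// S = symbol value, A = addend, P = address of the relocated field.
Applied computeReloc(RelocKind kind, uint64_t S, int64_t A, uint64_t P) {
  // Unsigned arithmetic wraps modulo 2^64, which avoids signed overflow.
  const uint64_t sum = S + static_cast<uint64_t>(A);
  switch (kind) {
  case RelocKind::Abs64:  // S + A, full 64 bits
    return {static_cast<int64_t>(sum), 8};
  case RelocKind::PC32: { // S + A - P, kept as a signed 32-bit quantity
    const uint32_t low = static_cast<uint32_t>(sum - P);
    return {static_cast<int64_t>(static_cast<int32_t>(low)), 4};
  }
  case RelocKind::Abs32:  // S + A, truncated and zero-extended
    return {static_cast<int64_t>(static_cast<uint32_t>(sum)), 4};
  case RelocKind::Abs32S: { // S + A, truncated and sign-extended
    const uint32_t low = static_cast<uint32_t>(sum);
    return {static_cast<int64_t>(static_cast<int32_t>(low)), 4};
  }
  }
  return {0, 0};
}
```

Doing the truncation through fixed-width casts rather than hand-built shift
masks sidesteps the question of what happens when a mask is shifted by 32 or
more bits.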

