[PATCH] [lld][ELF] Add support for reading dynamic libraries.

Michael Spencer bigcheesegs at gmail.com
Fri Feb 8 14:01:17 PST 2013



================
Comment at: include/lld/Core/LLVM.h:87-88
@@ +86,4 @@
+  size_t operator()(const llvm::StringRef &s) const {
+    using llvm::hash_value;
+    return hash_value(s);
+  }
----------------
Shankar Kalpathi Easwaran wrote:
> Wouldnt this be simpler ? return llvm::hash_value(s) ?
Yes.

================
Comment at: include/lld/Core/File.h:168-176
@@ -167,9 +167,11 @@
     virtual atom_iterator<T> begin() const {
-      return atom_iterator<T>(*this, reinterpret_cast<const void*>
-                                                              (_atoms.data()));
+      return atom_iterator<
+          T>(*this, _atoms.empty() ? 0 :
+                 reinterpret_cast<const void *>(_atoms.data()));
     }
     virtual atom_iterator<T> end() const{
-      return atom_iterator<T>(*this, reinterpret_cast<const void*>
-                                              (_atoms.data() + _atoms.size()));
+      return atom_iterator<
+          T>(*this, _atoms.empty() ? 0 :
+                 reinterpret_cast<const void *>(_atoms.data() + _atoms.size()));
     }
     virtual const T *deref(const void *it) const {
----------------
Shankar Kalpathi Easwaran wrote:
> clang-format is making this unreadable
I'll fix it.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:51-60
@@ +50,12 @@
+    // it exists.
+    for (auto i = obj.begin_elf_dynamic_symbols(),
+              e = obj.end_elf_dynamic_symbols();
+         i != e; ++i) {
+      StringRef name;
+      if (error_code ec =
+              obj.getSymbolName(obj.getDynamicSymbolTableSectionHeader(), &*i,
+                                name))
+        return ec;
+      file->_nameToSym[name].first = &*i;
+    }
+
----------------
Shankar Kalpathi Easwaran wrote:
> Just reading the top level library wont be enough. 
> 
> For example :-
> 
> cat > 1.c << \!
> int a;
> !
> 
> cat > 2.c << \!
> int b;
> !
> 
> gcc -fPIC -shared 1.c -o lib1.so
> gcc -fPIC -shared  2.c -L. -l1 -o lib2.so
> 
> $readelf --dyn-syms lib2.so | grep -w a\$
> $readelf --dyn-syms lib1.so | grep -w a\$
>      9: 0000000000201014     4 OBJECT  GLOBAL DEFAULT   23 a
> 
> 
> The symbol a would be only shown in lib1.so but not in lib2.so.
You have to explicitly link against lib2.so anyway.

  $ echo "int a;" | clang -fPIC -shared -x c - -o liba.so
  $ echo "extern int a; int foo() {return a;}" | clang -fPIC -shared -x c - -o libfoo.so -L. -la -Wl,-rpath=.
  $ echo "extern int a; int foo(); int main() {return foo() + a;} " | clang -fPIE -x c - -o shared -L. -lfoo -Wl,-rpath=.
  /usr/bin/ld: /tmp/--pyGKwV.o: undefined reference to symbol 'a'
  /usr/bin/ld: note: 'a' is defined in DSO ./liba.so so try adding it to the linker command line
  ./liba.so: could not read symbols: Invalid operation
  clang-3: error: linker command failed with exit code 1 (use -v to see invocation)

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:65-66
@@ +64,4 @@
+
+  virtual const atom_collection<DefinedAtom> &defined() const {
+    return _definedAtoms;
+  }
----------------
Shankar Kalpathi Easwaran wrote:
> definedAtoms. undefinedAtoms are not being populated here.
There's no reason to. We only care about symbols if they are referenced. We shouldn't waste time creating them otherwise.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:82
@@ +81,3 @@
+  virtual const SharedLibraryAtom *exports(StringRef name,
+                                           bool dataSymbolOnly) const {
+    // See if we have the symbol.
----------------
Shankar Kalpathi Easwaran wrote:
> Shankar Kalpathi Easwaran wrote:
> > dataSymbolOnly is not being used in the code
> maybe const can be removed ?
This function is logically const. Given the same input it will always produce the same output.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:82-84
@@ +81,5 @@
+  virtual const SharedLibraryAtom *exports(StringRef name,
+                                           bool dataSymbolOnly) const {
+    // See if we have the symbol.
+    auto sym = _nameToSym.find(name);
+    if (sym == _nameToSym.end())
----------------
Michael Spencer wrote:
> Shankar Kalpathi Easwaran wrote:
> > Shankar Kalpathi Easwaran wrote:
> > > dataSymbolOnly is not being used in the code
> > maybe const can be removed ?
> This function is logically const. Given the same input it will always produce the same output.
ELF doesn't have any concept of it.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:85
@@ +84,3 @@
+    auto sym = _nameToSym.find(name);
+    if (sym == _nameToSym.end())
+      return nullptr;
----------------
Shankar Kalpathi Easwaran wrote:
> The dynamic symbols may have undefined symbols and absolute symbols too, which the export should not pick.
It shouldn't pick undefined. It can pick absolute.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:111
@@ +110,3 @@
+                const SharedLibraryAtom *> > _nameToSym;
+  mutable llvm::BumpPtrAllocator _alloc;
+};
----------------
Shankar Kalpathi Easwaran wrote:
> Why mutable ? Once const is removed, the mutable keyword can be removed from both the member variables.
The function is logically const. And the resolver takes const Files.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:108-110
@@ +107,5 @@
+  mutable std::unordered_map<
+      StringRef,
+      std::pair<const typename llvm::object::ELFObjectFile<ELFT>::Elf_Sym *,
+                const SharedLibraryAtom *> > _nameToSym;
+  mutable llvm::BumpPtrAllocator _alloc;
----------------
Shankar Kalpathi Easwaran wrote:
> I think using a class might be better than using std::pair
k

================
Comment at: lib/ReaderWriter/ELF/Reader.cpp:58
@@ +57,3 @@
+  template <class ELFT>
+  static result_type create(const lld::ELFTargetInfo &ti,
+                            std::unique_ptr<llvm::MemoryBuffer> mb) {
----------------
Shankar Kalpathi Easwaran wrote:
> why static ?
To make it not a member function.

================
Comment at: lib/ReaderWriter/ELF/Reader.cpp:162-163
@@ -65,4 +161,4 @@
 
     std::size_t MaxAlignment =
         1ULL << llvm::CountTrailingZeros_64(uintptr_t(mb->getBufferStart()));
 
----------------
Shankar Kalpathi Easwaran wrote:
> Why MaxAlignment for shared libraries too ?
Because there's no reason not to.

================
Comment at: lib/ReaderWriter/ELF/Reader.cpp:168-170
@@ -71,61 +167,5 @@
     case llvm::sys::ELF_Relocatable_FileType: {
-      std::pair<unsigned char, unsigned char> Ident = getElfArchType(&*mb);
-      std::unique_ptr<File> f;
-      // Instantiate the correct File template instance based on the Ident
-      // pair. Once the File is created we push the file to the vector of files
-      // already created during parser's life.
-      if (Ident.first == llvm::ELF::ELFCLASS32 &&
-          Ident.second == llvm::ELF::ELFDATA2LSB) {
-#if !LLVM_IS_UNALIGNED_ACCESS_FAST
-        if (MaxAlignment >= 4)
-          f.reset(new ELFFile<ELFType<llvm::support::little, 4, false> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-#endif
-        if (MaxAlignment >= 2)
-          f.reset(new ELFFile<ELFType<llvm::support::little, 2, false> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-          llvm_unreachable("Invalid alignment for ELF file!");
-      } else if (Ident.first == llvm::ELF::ELFCLASS32 &&
-                 Ident.second == llvm::ELF::ELFDATA2MSB) {
-#if !LLVM_IS_UNALIGNED_ACCESS_FAST
-        if (MaxAlignment >= 4)
-          f.reset(new ELFFile<ELFType<llvm::support::big, 4, false> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-#endif
-        if (MaxAlignment >= 2)
-          f.reset(new ELFFile<ELFType<llvm::support::big, 2, false> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-          llvm_unreachable("Invalid alignment for ELF file!");
-      } else if (Ident.first == llvm::ELF::ELFCLASS64 &&
-                 Ident.second == llvm::ELF::ELFDATA2MSB) {
-#if !LLVM_IS_UNALIGNED_ACCESS_FAST
-        if (MaxAlignment >= 8)
-          f.reset(new ELFFile<ELFType<llvm::support::big, 8, true> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-#endif
-        if (MaxAlignment >= 2)
-          f.reset(new ELFFile<ELFType<llvm::support::big, 2, true> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-          llvm_unreachable("Invalid alignment for ELF file!");
-      } else if (Ident.first == llvm::ELF::ELFCLASS64 &&
-                 Ident.second == llvm::ELF::ELFDATA2LSB) {
-#if !LLVM_IS_UNALIGNED_ACCESS_FAST
-        if (MaxAlignment >= 8)
-          f.reset(new ELFFile<ELFType<llvm::support::little, 8, true> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-#endif
-        if (MaxAlignment >= 2)
-          f.reset(new ELFFile<ELFType<llvm::support::little, 2, true> >(
-                          _elfTargetInfo, std::move(mb), ec));
-        else
-          llvm_unreachable("Invalid alignment for ELF file!");
-      }
-      if (!ec)
-        result.push_back(std::move(f));
+      std::unique_ptr<File> f(createELF<ELFFile>(getElfArchType(&*mb),
+                                                 MaxAlignment, _elfTargetInfo,
+                                                 std::move(mb), ec));
+      if (ec)
----------------
Shankar Kalpathi Easwaran wrote:
> Why is the return value not taken and compared, may be a TODO action item can be set at this location.
Because it can't fail.

================
Comment at: lib/ReaderWriter/YAML/ReaderWriterYAML.cpp:224-232
@@ -226,6 +223,11 @@
+  virtual lld::File::atom_iterator<T> begin() const {
+    return lld::File::atom_iterator<
+        T>(*this,
+           _atoms.empty() ? 0 : reinterpret_cast<const void *>(_atoms.data()));
   }
-  virtual lld::File::atom_iterator<T> end() const{ 
-    return lld::File::atom_iterator<T>(*this, reinterpret_cast<const void*>
-                                            (_atoms.data() + _atoms.size()));
+  virtual lld::File::atom_iterator<T> end() const{
+    return lld::File::atom_iterator<
+        T>(*this, _atoms.empty() ? 0 :
+               reinterpret_cast<const void *>(_atoms.data() + _atoms.size()));
   }
   virtual const T *deref(const void *it) const {
----------------
Shankar Kalpathi Easwaran wrote:
> Was this a bug earlier ? Any testcase that was failing ?
Yes. The value returned by vector::data on an empty vector is unspecified (by omission). This code relies on it being 0.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:33-43
@@ +32,13 @@
+
+    llvm::OwningPtr<llvm::object::Binary> binaryFile;
+    if (error_code ec = createBinary(mb.release(), binaryFile))
+      return ec;
+
+    // Point Obj to correct class and bitwidth ELF object
+    file->_objFile.reset(
+        dyn_cast<llvm::object::ELFObjectFile<ELFT>>(binaryFile.get()));
+
+    if (!file->_objFile)
+      return make_error_code(llvm::object::object_error::invalid_file_type);
+
+    binaryFile.take();
----------------
Shankar Kalpathi Easwaran wrote:
> Seen this code in other places in lld, should this be made a utility function ?
Possibly.

================
Comment at: lib/ReaderWriter/ELF/DynamicFile.h:23-26
@@ +22,6 @@
+namespace elf {
+template <class ELFT> class DynamicFile LLVM_FINAL : public SharedLibraryFile {
+public:
+  static ErrorOr<std::unique_ptr<DynamicFile> > create(
+      const ELFTargetInfo &ti, std::unique_ptr<llvm::MemoryBuffer> mb) {
+    std::unique_ptr<DynamicFile> file(
----------------
Shankar Kalpathi Easwaran wrote:
> It might be useful to store the soname, http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
k

================
Comment at: lib/ReaderWriter/ELF/Atoms.h:476-478
@@ +475,5 @@
+
+  virtual bool canBeNullAtRuntime() const {
+    return _symbol->getBinding() == llvm::ELF::STB_WEAK;
+  }
+
----------------
Shankar Kalpathi Easwaran wrote:
> Couldnt follow this in the context of shared libraries ? Weak symbols are only applicable while generating static executables is what I thought
They apply to shared objects too.

================
Comment at: lib/ReaderWriter/ELF/Atoms.h:447-450
@@ +446,6 @@
+template <class ELFT>
+class ELFDynamicAtom LLVM_FINAL : public SharedLibraryAtom {
+  typedef llvm::object::Elf_Sym_Impl<ELFT> Elf_Sym;
+
+public:
+  ELFDynamicAtom(const DynamicFile<ELFT> &file, StringRef symbolName,
----------------
Shankar Kalpathi Easwaran wrote:
> We need ELFDynamicDefinedAtom, ELFDynamicUndefinedAtom, ELFDynamicAbsoluteAtom. Unresolved symbols from shared libraries should take part in symbol resolution as well. Similiarly Absolute symbols in shared libraries.
The linker doesn't care about undefined symbols in a shared object. The dynamic loader resolves these. The others are also unneeded.


http://llvm-reviews.chandlerc.com/D382



More information about the llvm-commits mailing list