[llvm-commits] [lld] patch for ELF Writer to get it to 'Hello world'

Sean Silva silvas at purdue.edu
Mon Dec 17 15:19:10 PST 2012


+  static inline bool classof(Chunk<target_endianness, is64Bits> *s) {
+    return true;
+  }

This is not necessary. See <http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html>.

+    // align the atom to the required modulus
+    // align the file offset and the memory offset seperately
+    // this is required so that BSS symbols are handled properly as the
+    // BSS symbols only occupy memory size and not file size

Please write comments as English prose with proper capitalization. See
<http://llvm.org/docs/CodingStandards.html#commenting>.

+  /// \brief convert the segment type to a String for diagnostics
+  ///        and printing purposes
+  StringRef segmentKindToStr() const {

This should probably be called `getSegmentKindAsString`. The current
name is really confusing.

+  /// \brief for LLVM style RTTI information
+  static inline bool classof(Section<target_endianness, is64Bits> *s) {
+    return true;
+  }

Kill this too.

+  SegmentSlice() { }

This class has quite a few members that will be default initialized.
You should doublecheck the amount of object code that this generates
before deciding to inline it.

+    if (!slice) {
+      slice = new SegmentSlice<target_endianness, is64Bits>();
+      _segmentSlices.push_back(slice);
+    }

Who owns this?

+  // For LLVM RTTI
+  static inline bool classof(Segment<target_endianness, is64Bits> *s) {
+    return true;
+  }

Kill this.

+  int pgsz() const { return _options.pageSize(); }

Why isn't this called just `pageSize()`?

+  static inline bool classof(ELFStringTable<target_endianness, is64Bits> *s) {
+    return true;
+  }
+
+  static inline bool classof(const Section<target_endianness, is64Bits> *c) {
+    return c->kind() == Section<target_endianness, is64Bits>::StringTable;
+  }
+
+  static inline bool classof(const Chunk<target_endianness, is64Bits> *c) {
+    return c->kind() == Section<target_endianness, is64Bits>::StringTable;
+  }

The top one is unnecessary, and only one of the bottom two are
necessary (whichever is highest in the inheritance hierarchy).

-class ELFSymbolTableChunk : public SectionChunk<target_endianness, is64Bits> {
+class ELFSymbolTable : public Section<target_endianness, is64Bits> {

Do this kind of renaming in a separate patch. I'm pretty sure that
this patch overall can be split up. You list 3 different changes in
your mail so those are probably a start on the road to breaking up
this gigantic patch.

+  ELFSymbolTable(const char *str,
+                 int32_t order):
+                              Section<target_endianness, is64Bits>(str,
+                                                      llvm::ELF::SHT_SYMTAB,
+                                                      0,
+                                                      order,
+                              Section<target_endianness,
is64Bits>::SymbolTable)

What is up with the formatting here?

+    std::stable_sort(_symbolTable.begin(), _symbolTable.end(), ([]
+                     (const Elf_Sym *A, const Elf_Sym *B) -> bool {
+                     return (A->getBinding() < B->getBinding());}));

This is unreadable. Please format it in a more readable fashion.

+      memcpy(dest, (*sti), sizeof(Elf_Sym));
+      dest += sizeof(Elf_Sym);

I've seen this pattern in a couple places. Extract it into a static helper.

+  void e_ident(int I, unsigned char C) { _eh.e_ident[I] = C; }
+  void e_type(uint16_t type)           { _eh.e_type = type; }
+  void e_machine(uint16_t machine)     { _eh.e_machine = machine; }
+  void e_version(uint32_t version)     { _eh.e_version = version; }
+  void e_entry(int64_t entry)         { _eh.e_entry = entry; }
+  void e_phoff(int64_t phoff)         { _eh.e_phoff = phoff; }
+  void e_shoff(int64_t shoff)         { _eh.e_shoff = shoff; }
+  void e_flags(uint32_t flags)         { _eh.e_flags = flags; }
+  void e_ehsize(uint16_t ehsize)       { _eh.e_ehsize = ehsize; }
+  void e_phentsize(uint16_t phentsize) { _eh.e_phentsize = phentsize; }
+  void e_phnum(uint16_t phnum)         { _eh.e_phnum = phnum; }
+  void e_shentsize(uint16_t shentsize) { _eh.e_shentsize = shentsize; }
+  void e_shnum(uint16_t shnum)         { _eh.e_shnum = shnum; }
+  void e_shstrndx(uint16_t shstrndx)   { _eh.e_shstrndx = shstrndx; }
+  int64_t  filesize()                 { return sizeof (Elf_Ehdr); }

These need to have "set" somewhere in their name. Alternatively (and
probably better), just directly manipulate the data members instead of
adding this pointless layer of indirection (it's not "encapsulating"
anything).

+    uint8_t *chunkBuffer = buffer->getBufferStart();
+    uint8_t *dest = chunkBuffer + this->fileOffset();

This pattern seems to be repeated in a bunch of places. I think that
this whole dance like:

+    uint8_t *chunkBuffer = buffer->getBufferStart();
+    uint8_t *dest = chunkBuffer + this->fileOffset();
+    for (auto phi = _ph.begin(); phi != _ph.end(); ++phi) {
+      memcpy(dest, (*phi), sizeof(Elf_Phdr));
+      dest += sizeof(Elf_Phdr);
+    }

can be easily reduced to a couple methods on the buffer or a tiny
BufferWriter class. e.g. it could be something like

auto bw = buffer->getBufferWriterAt(this->fileOffset());
for (auto *ph : _ph)
  bw.write(ph, sizeof(Elf_Phdr));

Factoring it like this would also expose a couple places to put a few
key assertions. This can probably be a separate patch.

+      }
+      else {

Put these on the same line.

+  void findSectionByName(chunk_iter *chunk, StringRef s) {
+    for (auto ai = sections_begin(); ai != sections_end(); ++ai) {
+      if ((*ai)->name() == s) {
+        chunk = *ai;
+        return;
+      }
+    }
+    chunk = sections_end();
+  }

std::find? Also you may consider something faster, since a linear
search here will cause other code to potentially become quadratic.

+      if ((*ai)->kind() == Chunk<target_endianness, is64Bits>::Section) {
+        section = llvm::dyn_cast<Section<target_endianness, is64Bits>>(*ai);

Just use a single dyn_cast<> and assign its result inside the `if` condition.

+{
+  _layout = new DefaultELFLayout<target_endianness, is64Bits>(options);
+}

What is the ownership of this pointer?


Finally: this patch needs tests. Tests, tests, tests.

-- Sean Silva


On Mon, Dec 17, 2012 at 8:15 AM, Shankar Easwaran
<shankare at codeaurora.org> wrote:
> Hi,
>
> Attached is a patch to get the linker for Hexagon to get to 'Hello world'.
>
> The changes are :-
>
> 1) Adds a ELF Layout class to WriterELF, becomes much easier to control
> section ordering
> 2) Adds Section/Segment semantics to the ELF linker
> 3) Easier to add Linker script support in the future.
>
> Attached is the patch ? Let me know if its ok to commit ?
>
> Thanks
>
> Shankar Easwaran
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
> the Linux Foundation
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>



More information about the llvm-commits mailing list