[llvm-commits] [PATCH 4/5] Add ELF ObjectWriter and Streamer support.

Matt Fleming matt at console-pimps.org
Sun Aug 1 15:56:59 PDT 2010


On Mon, 26 Jul 2010 14:36:35 -0700, Eli Friedman <eli.friedman at gmail.com> wrote:

Hi Eli,

thanks for this detailed review. It's much appreciated. Sorry it took so
long to get back to you.

> On Mon, Jul 26, 2010 at 1:00 PM, Matt Fleming <matt at console-pimps.org> wrote:
> +// Because all the symbol flags need to be stored in the MCSymbolData
> +// 'flags' variable we need to provide shift constants per flag type.
> +enum {
> +  STT_SHIFT = 0, // Shift value for STT_* flags.
> +  STB_SHIFT = 4, // Shift value for STB_* flags.
> +  STV_SHIFT = 8  // Shift value ofr STV_* flags.
> +};
> 
> I don't think this belongs in ELF.h; it seems specific to the MC implementation.

Agreed. I've moved it into include/llvm/MC/MCELFSymbolFlags.h and added
some more enums to stop requiring all the shifting.

> > +  static bool isFixupKindPCRel(unsigned Kind) {
> > +    switch (Kind) {
> > +    default:
> > +      return false;
> > +    case X86::reloc_pcrel_1byte:
> > +    case X86::reloc_pcrel_4byte:
> > +    case X86::reloc_riprel_4byte:
> > +    case X86::reloc_riprel_4byte_movq_load:
> > +      return true;
> > +    }
> > +  }
> 
> Change the name to indicate that this x86-specific?

Done.

> > +  static bool isFixupKindRIPRel(unsigned Kind) {
> > +    return Kind == X86::reloc_riprel_4byte ||
> > +      Kind == X86::reloc_riprel_4byte_movq_load;
> > +  }
> 
> Same.

Done.

> > +  // Emit the ELF header.
> > +  void WriteHeader(uint64_t SectionDataSize, unsigned NumberOfSections) {
> 
> (General comment) Could you move the large method implementations out of the
> class definition?

Yeah, that seems fair. Done.

> > +    Write8(ELF::EV_CURRENT);        // e_ident[EI_VERSION]
> > +    Write8(ELF::ELFOSABI_LINUX);    // e_ident[EI_OSABI]
> > +    Write8(0);                 // e_ident[EI_ABIVERSION]
> 
> Factor this out so that it's obvious something needs to change here to make
> it crossplatform.

Hmm.. I'm not quite sure how to do that. The infrastructure for this
type of thing doesn't exist. Daniel, do you have any advice on this or
how it should be done?

> > +    // FIXME
> > +    Write16(ELF::EM_X86_64); // e_machine = target
> 
> Same.
> 
> > +    Write32(ELF::EV_CURRENT);         // e_version
> > +    WriteWord(0);                    // e_entry, no entry point in .o file
> > +    WriteWord(0);                    // e_phoff, no program header for .o
> > +    WriteWord(SectionDataSize + 64);  // e_shoff = sec hdr table off in bytes
> 
> Odd-looking indentation.

I've sorted out the indentation for all this code.

> > +    // FIXME
> > +    Write32(0);   // e_flags = whatever the target wants
> 
> Factor out.
> 
> > +    Write16(Is64Bit ? 64 : 52);  // e_ehsize = ELF header size
> 
> sizeof(Elf32_Ehdr)?  Or at least name these values somehow...

Yeah, sizeof(Elf32_Ehdr), etc seems sane. Done.

> > +    // e_shentsize = Section header entry size
> > +    Write16(Is64Bit ? 64 : 40);
> 
> See above.

Done.

> > +    // e_shnum     = # of section header ents
> > +    Write16(NumberOfSections);
> > +
> > +    // e_shstrndx  = Section # of '.shstrtab'
> > +    Write16(ShstrtabIndex);
> > +  }
> > +
> > +  void WriteSymbolEntry(MCDataFragment *F, uint64_t name, uint8_t info,
> > +                        uint64_t value, uint64_t size,
> > +                        uint8_t other, uint16_t shndx) {
> > +    if (Is64Bit) {
> > +      F->getContents() += StringRef((const char *)&name, 4);  // st_name
> > +      F->getContents() += StringRef((const char *)&info, 1);  // st_info
> > +      F->getContents() += StringRef((const char *)&other, 1); // st_other
> > +      F->getContents() += StringRef((const char *)&shndx, 2); // st_shndx
> > +      F->getContents() += StringRef((const char *)&value, 8); // st_value
> > +      // FIXME
> > +      F->getContents() += StringRef((const char *)&size, 8);  // st_size
> 
> FIXME what?  Same for all the unlableled FIXMEs.

These particular FIXMEs are no longer needed. I'll add descriptions to
the other unlabeled FIXMEs.

> > +      Value = Layout.getSymbolAddress(&Data);
> > +      MCValue Res;
> > +      if (Data.getSizeSymbol()->EvaluateAsRelocatable(Res, &Layout)) {
> > +        MCSymbolData &A =
> > +            Layout.getAssembler().getSymbolData(Res.getSymA()->getSymbol());
> > +        MCSymbolData &B =
> > +            Layout.getAssembler().getSymbolData(Res.getSymB()->getSymbol());
> > +
> > +       Size = Layout.getSymbolAddress(&A) - Layout.getSymbolAddress(&B);
> 
> How is this different from EvaluateAsAbsolute?

I'm not sure if EvalulateAsAbsolute would work here. I think the
expression may not be constant. Roman, what do you think?

> > +  // XXX: this is currently X86_64 only
> 
> Don't use XXX; use FIXME instead.

Done.

> > +      if (Symbol.isUndefined()) {
> > +        MSD.SectionIndex = ELF::SHN_UNDEF;
> > +       // XXX: for some reason we dont Emit* this
> > +       it->setFlags(it->getFlags() | (ELF::STB_GLOBAL << ELF::STB_SHIFT));
> > +        UndefinedSymbolData.push_back(MSD);
> 
> Indentation?
>
> > +
> > +  void WriteRelocations(MCAssembler &Asm, MCAsmLayout &Layout) {
> > +    for (MCAssembler::const_iterator it = Asm.begin(),
> > +           ie = Asm.end(); it != ie; ++it) {
> > +      const MCSectionData &SD = *it;
> > +
> > +      WriteRelocation(Asm, Layout, SD);
> 
> WriteRelocation(Asm, Layout, *it);

Fixed.

> > +  void WriteSecHdrEntry(uint32_t Name, uint32_t Type, uint64_t Flags,
> > +                       uint64_t Address, uint64_t Offset,
> > +                       uint64_t Size, uint32_t Link, uint32_t Info,
> > +                       uint64_t Alignment, uint64_t EntrySize) {
> > +    // Suppose that all non-initialized EntrySizes are actually a zero
> > +    if (EntrySize == ~0U)
> > +      EntrySize = 0;
> 
> Messy... would it be possible to just set it correctly in the first place?

Yeah, I've reworked this because of your comments on the EntrySize
patch.

> > +         F->getContents() +=  StringRef((const char *)&entry.r_offset, 8);
> > +         F->getContents() +=  StringRef((const char *)&entry.r_info, 8);
> 
> Is this endian-safe?

Nope. This is little endian specific. I've fixed this up now.

> > +    // Write out the ELF header ...
> > +    WriteHeader(SectionDataSize, NumSections);
> > +    FileOff = Is64Bit ? 64 : 52;
> 
> Again, names...

Done.

> > +    // ... and then the section header table.
> > +    // Should we align the section header table?
> > +    //
> > +    // Null secton first.
> 
> *section

Fixed.

> > +      switch(Section.getType()) {
> > +      case ELF::SHT_DYNAMIC:
> > +        sh_link = SectionStringTableIndex[&it->getSection()];
> > +        sh_info = 0;
> > +        break;
> > +
> > +      case ELF::SHT_REL:
> > +      case ELF::SHT_RELA:
> > +        const MCSection *SymtabSection;
> > +        const MCSection *InfoSection;
> > +        const StringRef *SectionName;
> > +        const MCSectionData *SymtabSD;
> > +        const MCSectionData *InfoSD;
> > +
> > +        SymtabSection = Asm.getContext().getELFSection(".symtab", ELF::SHT_SYMTAB, 0,
> > +                        SectionKind::getReadOnly(), false);
> > +        SymtabSD = &Asm.getSectionData(*SymtabSection);
> > +        // we have to count the empty section in too
> > +        sh_link = SymtabSD->getLayoutOrder() + 1;
> > +
> > +        SectionName = &Section.getSectionName();
> > +        SectionName = &SectionName->slice(5, SectionName->size());
> > +        InfoSection = Asm.getContext().getELFSection(*SectionName,
> > +          ELF::SHT_PROGBITS, 0, SectionKind::getReadOnly(), false);
> > +        InfoSD = &Asm.getSectionData(*InfoSection);
> > +        sh_info = InfoSD->getLayoutOrder() + 1;
> > +        break;
> > +
> > +      case ELF::SHT_SYMTAB:
> > +      case ELF::SHT_DYNSYM:
> > +        sh_link = StringTableIndex;
> > +        sh_info = LastLocalSymbolIndex;
> > +        break;
> > +
> > +      case ELF::SHT_HASH:
> > +      case ELF::SHT_GROUP:
> > +      case ELF::SHT_SYMTAB_SHNDX:
> 
> "default:" ?

Sure, that seems sensible. Fixed.

> > +        assert(0 && "FIXME: sh_type value not supported!");
> > +        break;
> > +      }
> > +
> > +      WriteSecHdrEntry(SectionStringTableIndex[&it->getSection()],
> > +                      Section.getType(), Section.getFlags(),
> > +                      Layout.getSectionAddress(&SD),
> > +                      SectionOffsetMap.lookup(&SD.getSection()),
> > +                      Layout.getSectionSize(&SD), sh_link,
> > +                       sh_info, SD.getAlignment(),
> > +                       SD.getEntrySize());
> 
> Indentation.
> 
> +namespace {
> +
> +class MCELFStreamer : public MCStreamer {
> +private:
> 
> Should inherit from MCObjectStreamer.

Fixed. 

> > +  MCFragment *getCurrentFragment() const {
> > +    assert(CurSectionData && "No current section!");
> > +
> > +    if (!CurSectionData->empty())
> > +      return &CurSectionData->getFragmentList().back();
> > +
> > +    return 0;
> > +  }
> 
> Shadows MCObjectStreamer::getCurrentFragment().
> 

Deleted.

> > +  /// fragment is not a data fragment.
> > +  MCDataFragment *getOrCreateDataFragment() const {
> > +    MCDataFragment *F = dyn_cast_or_null<MCDataFragment>(getCurrentFragment());
> > +    if (!F)
> > +      F = new MCDataFragment(CurSectionData);
> > +    return F;
> > +  }
> 
> Shadows MCObjectStreamer::getOrCreateDataFragment().

Deleted.

> > +  MCAssembler &getAssembler() { return Assembler; }
> 
> Shadows MCObjectStreamer::getAssembler().

Deleted.

> > +  const MCExpr *AddValueSymbols(const MCExpr *Value) {
> > +    switch (Value->getKind()) {
> > +    case MCExpr::Target: assert(0 && "Can't handle target exprs yet!");
> > +    case MCExpr::Constant:
> > +      break;
> > +
> > +    case MCExpr::Binary: {
> > +      const MCBinaryExpr *BE = cast<MCBinaryExpr>(Value);
> > +      AddValueSymbols(BE->getLHS());
> > +      AddValueSymbols(BE->getRHS());
> > +      break;
> > +    }
> > +
> > +    case MCExpr::SymbolRef:
> > +      Assembler.getOrCreateSymbolData(
> > +        cast<MCSymbolRefExpr>(Value)->getSymbol());
> > +      break;
> > +
> > +    case MCExpr::Unary:
> > +      AddValueSymbols(cast<MCUnaryExpr>(Value)->getSubExpr());
> > +      break;
> > +    }
> > +
> > +    return Value;
> > +  }
> 
> Shadows MCObjectStreamer::AddValueSymbols.

Deleted.

> > +
> > +  virtual void EmitELFSize(MCSymbol *Symbol, const MCExpr *Value) {
> 
> Push definition outside of class defn.

I fixed a bug in this method and now it's only 2 lines long. Does the
definition still need to be pushed outside of the class definition?

> > +  virtual void EmitDwarfFileDirective(unsigned FileNo, StringRef Filename) {
> > +    errs() << "FIXME: MCELFStreamer:EmitDwarfFileDirective not implemented\n";
> > +  }
> 
> DEBUG(dbgs() << "")?

OK. Fixed.

> > +  virtual void EmitInstruction(const MCInst &Inst);
> > +  virtual void Finish();
> > +
> > +  /// @}
> > +};
> > +
> > +} // end anonymous namespace.
> > +
> > +void MCELFStreamer::SwitchSection(const MCSection *Section) {
> > +  assert(Section && "Cannot switch to a null section!");
> > +
> > +  // If already in this section, then this is a noop.
> > +  if (Section == CurSection) return;
> > +
> > +  CurSection = Section;
> > +  CurSectionData = &Assembler.getOrCreateSectionData(*Section);
> > +}
> 
> Identical to MCObjectStreamer::SwitchSection.

Deleted.

> > +void MCELFStreamer::EmitAssignment(MCSymbol *Symbol, const MCExpr *Value) {
> > +  // Only absolute symbols can be redefined.
> > +  assert((Symbol->isUndefined() || Symbol->isAbsolute()) &&
> > +         "Cannot define a symbol twice!");
> > +
> > +  // FIXME: Lift context changes into super class.
> > +  // FIXME: Set associated section.
> > +  //  Symbol->setValue(AddValueSymbols(Value));
> > +}
> 
> Should this be different from MCMachOStreamer::EmitAssignment?

I don't think so. I've fixed this now.

> > +void MCELFStreamer::EmitSymbolDesc(MCSymbol *Symbol, unsigned DescValue) {
> > +}
> 
> assert(0)?  Or what is this supposed to be useful for?

I actually don't think the .desc directive makes sense for ELF. I've
deleted it.

> > +void MCELFStreamer::EmitCommonSymbol(MCSymbol *Symbol, uint64_t Size,
> > +                                       unsigned ByteAlignment) {
> > +  MCSymbolData &SD = Assembler.getOrCreateSymbolData(*Symbol);
> > +
> > +  if ((SD.getFlags() & (0xff << ELF::STB_SHIFT)) == ELF::STB_LOCAL << ELF::STB_SHIFT) {
> 
> STB_* is only 4 bits, so 0xf, no?

Correct. Fixed.

> > +void MCELFStreamer::EmitZerofill(const MCSection *Section, MCSymbol *Symbol,
> > +                                   unsigned Size, unsigned ByteAlignment) {
> 
> AFAIK, .zerofill is Darwin-only.

True. Deleted.

> > +void MCELFStreamer::EmitTBSSSymbol(const MCSection *Section, MCSymbol *Symbol,
> > +                                   uint64_t Size, unsigned ByteAlignment) {
> 
> Also Darwin only, AFAIK.

True. Deleted.

> > +void MCELFStreamer::EmitBytes(StringRef Data, unsigned AddrSpace) {
> > +  getOrCreateDataFragment()->getContents().append(Data.begin(), Data.end());
> > +}
> 
> Same as MCMachOStreamer::EmitBytes; toss in a TODO?

Done.

> > +void MCELFStreamer::EmitValue(const MCExpr *Value, unsigned Size,
> > +                                unsigned AddrSpace) {
> > +  MCDataFragment *DF = getOrCreateDataFragment();
> > +
> > +  // Avoid fixups when possible.
> > +  int64_t AbsValue;
> > +  if (AddValueSymbols(Value)->EvaluateAsAbsolute(AbsValue)) {
> > +    // FIXME: Endianness assumption.
> > +    for (unsigned i = 0; i != Size; ++i)
> > +      DF->getContents().push_back(uint8_t(AbsValue >> (i * 8)));
> > +  } else {
> > +    DF->addFixup(MCFixup::Create(DF->getContents().size(), AddValueSymbols(Value),
> > +                                 MCFixup::getKindForSize(Size)));
> > +    DF->getContents().resize(DF->getContents().size() + Size, 0);
> > +  }
> > +}
> 
> Same as MCMachOStreamer::EmitValue; toss in a TODO?

Done.

> > +void MCELFStreamer::EmitValueToAlignment(unsigned ByteAlignment,
> > +                                           int64_t Value, unsigned ValueSize,
> > +                                           unsigned MaxBytesToEmit) {
> > +  if (MaxBytesToEmit == 0)
> > +    MaxBytesToEmit = ByteAlignment;
> > +  new MCAlignFragment(ByteAlignment, Value, ValueSize, MaxBytesToEmit,
> > +                      CurSectionData);
> > +
> > +  // Update the maximum alignment on the current section if necessary.
> > +  if (ByteAlignment > CurSectionData->getAlignment())
> > +    CurSectionData->setAlignment(ByteAlignment);
> > +}
> 
> Same as MCMachOStreamer::EmitValueToAlignment; toss in a TODO?

Done.

> > +void MCELFStreamer::EmitCodeAlignment(unsigned ByteAlignment,
> > +                                        unsigned MaxBytesToEmit) {
> > +  if (MaxBytesToEmit == 0)
> > +    MaxBytesToEmit = ByteAlignment;
> > +  new MCAlignFragment(ByteAlignment, 0, 1, MaxBytesToEmit,
> > +                      CurSectionData);
> > +
> > +  // Update the maximum alignment on the current section if necessary.
> > +  if (ByteAlignment > CurSectionData->getAlignment())
> > +    CurSectionData->setAlignment(ByteAlignment);
> > +}
> 
> Same as MCMachOStreamer::EmitCodeAlignment; toss in a TODO?

Done.

> > +void MCELFStreamer::EmitValueToOffset(const MCExpr *Offset,
> > +                                        unsigned char Value) {
> > +  new MCOrgFragment(*Offset, Value, CurSectionData);
> > +}
> 
> Same as MCMachOStreamer::EmitValueToOffset; toss in a TODO?

Done. I'll post the updated series sometime this week.

Thanks!



More information about the llvm-commits mailing list