[llvm-commits] [PATCH 4/5] Add ELF ObjectWriter and Streamer support.
Matt Fleming
matt at console-pimps.org
Sun Aug 1 15:56:59 PDT 2010
On Mon, 26 Jul 2010 14:36:35 -0700, Eli Friedman <eli.friedman at gmail.com> wrote:
Hi Eli,
thanks for this detailed review. It's much appreciated. Sorry it took so
long to get back to you.
> On Mon, Jul 26, 2010 at 1:00 PM, Matt Fleming <matt at console-pimps.org> wrote:
> +// Because all the symbol flags need to be stored in the MCSymbolData
> +// 'flags' variable we need to provide shift constants per flag type.
> +enum {
> + STT_SHIFT = 0, // Shift value for STT_* flags.
> + STB_SHIFT = 4, // Shift value for STB_* flags.
> + STV_SHIFT = 8 // Shift value ofr STV_* flags.
> +};
>
> I don't think this belongs in ELF.h; it seems specific to the MC implementation.
Agreed. I've moved it into include/llvm/MC/MCELFSymbolFlags.h and added
some more enums to stop requiring all the shifting.
> > + static bool isFixupKindPCRel(unsigned Kind) {
> > + switch (Kind) {
> > + default:
> > + return false;
> > + case X86::reloc_pcrel_1byte:
> > + case X86::reloc_pcrel_4byte:
> > + case X86::reloc_riprel_4byte:
> > + case X86::reloc_riprel_4byte_movq_load:
> > + return true;
> > + }
> > + }
>
> Change the name to indicate that this x86-specific?
Done.
> > + static bool isFixupKindRIPRel(unsigned Kind) {
> > + return Kind == X86::reloc_riprel_4byte ||
> > + Kind == X86::reloc_riprel_4byte_movq_load;
> > + }
>
> Same.
Done.
> > + // Emit the ELF header.
> > + void WriteHeader(uint64_t SectionDataSize, unsigned NumberOfSections) {
>
> (General comment) Could you move the large method implementations out of the
> class definition?
Yeah, that seems fair. Done.
> > + Write8(ELF::EV_CURRENT); // e_ident[EI_VERSION]
> > + Write8(ELF::ELFOSABI_LINUX); // e_ident[EI_OSABI]
> > + Write8(0); // e_ident[EI_ABIVERSION]
>
> Factor this out so that it's obvious something needs to change here to make
> it crossplatform.
Hmm.. I'm not quite sure how to do that. The infrastructure for this
type of thing doesn't exist. Daniel, do you have any advice on this or
how it should be done?
> > + // FIXME
> > + Write16(ELF::EM_X86_64); // e_machine = target
>
> Same.
>
> > + Write32(ELF::EV_CURRENT); // e_version
> > + WriteWord(0); // e_entry, no entry point in .o file
> > + WriteWord(0); // e_phoff, no program header for .o
> > + WriteWord(SectionDataSize + 64); // e_shoff = sec hdr table off in bytes
>
> Odd-looking indentation.
I've sorted out the indentation for all this code.
> > + // FIXME
> > + Write32(0); // e_flags = whatever the target wants
>
> Factor out.
>
> > + Write16(Is64Bit ? 64 : 52); // e_ehsize = ELF header size
>
> sizeof(Elf32_Ehdr)? Or at least name these values somehow...
Yeah, sizeof(Elf32_Ehdr), etc seems sane. Done.
> > + // e_shentsize = Section header entry size
> > + Write16(Is64Bit ? 64 : 40);
>
> See above.
Done.
> > + // e_shnum = # of section header ents
> > + Write16(NumberOfSections);
> > +
> > + // e_shstrndx = Section # of '.shstrtab'
> > + Write16(ShstrtabIndex);
> > + }
> > +
> > + void WriteSymbolEntry(MCDataFragment *F, uint64_t name, uint8_t info,
> > + uint64_t value, uint64_t size,
> > + uint8_t other, uint16_t shndx) {
> > + if (Is64Bit) {
> > + F->getContents() += StringRef((const char *)&name, 4); // st_name
> > + F->getContents() += StringRef((const char *)&info, 1); // st_info
> > + F->getContents() += StringRef((const char *)&other, 1); // st_other
> > + F->getContents() += StringRef((const char *)&shndx, 2); // st_shndx
> > + F->getContents() += StringRef((const char *)&value, 8); // st_value
> > + // FIXME
> > + F->getContents() += StringRef((const char *)&size, 8); // st_size
>
> FIXME what? Same for all the unlableled FIXMEs.
These particular FIXMEs are no longer needed. I'll add descriptions to
the other unlabeled FIXMEs.
> > + Value = Layout.getSymbolAddress(&Data);
> > + MCValue Res;
> > + if (Data.getSizeSymbol()->EvaluateAsRelocatable(Res, &Layout)) {
> > + MCSymbolData &A =
> > + Layout.getAssembler().getSymbolData(Res.getSymA()->getSymbol());
> > + MCSymbolData &B =
> > + Layout.getAssembler().getSymbolData(Res.getSymB()->getSymbol());
> > +
> > + Size = Layout.getSymbolAddress(&A) - Layout.getSymbolAddress(&B);
>
> How is this different from EvaluateAsAbsolute?
I'm not sure if EvalulateAsAbsolute would work here. I think the
expression may not be constant. Roman, what do you think?
> > + // XXX: this is currently X86_64 only
>
> Don't use XXX; use FIXME instead.
Done.
> > + if (Symbol.isUndefined()) {
> > + MSD.SectionIndex = ELF::SHN_UNDEF;
> > + // XXX: for some reason we dont Emit* this
> > + it->setFlags(it->getFlags() | (ELF::STB_GLOBAL << ELF::STB_SHIFT));
> > + UndefinedSymbolData.push_back(MSD);
>
> Indentation?
>
> > +
> > + void WriteRelocations(MCAssembler &Asm, MCAsmLayout &Layout) {
> > + for (MCAssembler::const_iterator it = Asm.begin(),
> > + ie = Asm.end(); it != ie; ++it) {
> > + const MCSectionData &SD = *it;
> > +
> > + WriteRelocation(Asm, Layout, SD);
>
> WriteRelocation(Asm, Layout, *it);
Fixed.
> > + void WriteSecHdrEntry(uint32_t Name, uint32_t Type, uint64_t Flags,
> > + uint64_t Address, uint64_t Offset,
> > + uint64_t Size, uint32_t Link, uint32_t Info,
> > + uint64_t Alignment, uint64_t EntrySize) {
> > + // Suppose that all non-initialized EntrySizes are actually a zero
> > + if (EntrySize == ~0U)
> > + EntrySize = 0;
>
> Messy... would it be possible to just set it correctly in the first place?
Yeah, I've reworked this because of your comments on the EntrySize
patch.
> > + F->getContents() += StringRef((const char *)&entry.r_offset, 8);
> > + F->getContents() += StringRef((const char *)&entry.r_info, 8);
>
> Is this endian-safe?
Nope. This is little endian specific. I've fixed this up now.
> > + // Write out the ELF header ...
> > + WriteHeader(SectionDataSize, NumSections);
> > + FileOff = Is64Bit ? 64 : 52;
>
> Again, names...
Done.
> > + // ... and then the section header table.
> > + // Should we align the section header table?
> > + //
> > + // Null secton first.
>
> *section
Fixed.
> > + switch(Section.getType()) {
> > + case ELF::SHT_DYNAMIC:
> > + sh_link = SectionStringTableIndex[&it->getSection()];
> > + sh_info = 0;
> > + break;
> > +
> > + case ELF::SHT_REL:
> > + case ELF::SHT_RELA:
> > + const MCSection *SymtabSection;
> > + const MCSection *InfoSection;
> > + const StringRef *SectionName;
> > + const MCSectionData *SymtabSD;
> > + const MCSectionData *InfoSD;
> > +
> > + SymtabSection = Asm.getContext().getELFSection(".symtab", ELF::SHT_SYMTAB, 0,
> > + SectionKind::getReadOnly(), false);
> > + SymtabSD = &Asm.getSectionData(*SymtabSection);
> > + // we have to count the empty section in too
> > + sh_link = SymtabSD->getLayoutOrder() + 1;
> > +
> > + SectionName = &Section.getSectionName();
> > + SectionName = &SectionName->slice(5, SectionName->size());
> > + InfoSection = Asm.getContext().getELFSection(*SectionName,
> > + ELF::SHT_PROGBITS, 0, SectionKind::getReadOnly(), false);
> > + InfoSD = &Asm.getSectionData(*InfoSection);
> > + sh_info = InfoSD->getLayoutOrder() + 1;
> > + break;
> > +
> > + case ELF::SHT_SYMTAB:
> > + case ELF::SHT_DYNSYM:
> > + sh_link = StringTableIndex;
> > + sh_info = LastLocalSymbolIndex;
> > + break;
> > +
> > + case ELF::SHT_HASH:
> > + case ELF::SHT_GROUP:
> > + case ELF::SHT_SYMTAB_SHNDX:
>
> "default:" ?
Sure, that seems sensible. Fixed.
> > + assert(0 && "FIXME: sh_type value not supported!");
> > + break;
> > + }
> > +
> > + WriteSecHdrEntry(SectionStringTableIndex[&it->getSection()],
> > + Section.getType(), Section.getFlags(),
> > + Layout.getSectionAddress(&SD),
> > + SectionOffsetMap.lookup(&SD.getSection()),
> > + Layout.getSectionSize(&SD), sh_link,
> > + sh_info, SD.getAlignment(),
> > + SD.getEntrySize());
>
> Indentation.
>
> +namespace {
> +
> +class MCELFStreamer : public MCStreamer {
> +private:
>
> Should inherit from MCObjectStreamer.
Fixed.
> > + MCFragment *getCurrentFragment() const {
> > + assert(CurSectionData && "No current section!");
> > +
> > + if (!CurSectionData->empty())
> > + return &CurSectionData->getFragmentList().back();
> > +
> > + return 0;
> > + }
>
> Shadows MCObjectStreamer::getCurrentFragment().
>
Deleted.
> > + /// fragment is not a data fragment.
> > + MCDataFragment *getOrCreateDataFragment() const {
> > + MCDataFragment *F = dyn_cast_or_null<MCDataFragment>(getCurrentFragment());
> > + if (!F)
> > + F = new MCDataFragment(CurSectionData);
> > + return F;
> > + }
>
> Shadows MCObjectStreamer::getOrCreateDataFragment().
Deleted.
> > + MCAssembler &getAssembler() { return Assembler; }
>
> Shadows MCObjectStreamer::getAssembler().
Deleted.
> > + const MCExpr *AddValueSymbols(const MCExpr *Value) {
> > + switch (Value->getKind()) {
> > + case MCExpr::Target: assert(0 && "Can't handle target exprs yet!");
> > + case MCExpr::Constant:
> > + break;
> > +
> > + case MCExpr::Binary: {
> > + const MCBinaryExpr *BE = cast<MCBinaryExpr>(Value);
> > + AddValueSymbols(BE->getLHS());
> > + AddValueSymbols(BE->getRHS());
> > + break;
> > + }
> > +
> > + case MCExpr::SymbolRef:
> > + Assembler.getOrCreateSymbolData(
> > + cast<MCSymbolRefExpr>(Value)->getSymbol());
> > + break;
> > +
> > + case MCExpr::Unary:
> > + AddValueSymbols(cast<MCUnaryExpr>(Value)->getSubExpr());
> > + break;
> > + }
> > +
> > + return Value;
> > + }
>
> Shadows MCObjectStreamer::AddValueSymbols.
Deleted.
> > +
> > + virtual void EmitELFSize(MCSymbol *Symbol, const MCExpr *Value) {
>
> Push definition outside of class defn.
I fixed a bug in this method and now it's only 2 lines long. Does the
definition still need to be pushed outside of the class definition?
> > + virtual void EmitDwarfFileDirective(unsigned FileNo, StringRef Filename) {
> > + errs() << "FIXME: MCELFStreamer:EmitDwarfFileDirective not implemented\n";
> > + }
>
> DEBUG(dbgs() << "")?
OK. Fixed.
> > + virtual void EmitInstruction(const MCInst &Inst);
> > + virtual void Finish();
> > +
> > + /// @}
> > +};
> > +
> > +} // end anonymous namespace.
> > +
> > +void MCELFStreamer::SwitchSection(const MCSection *Section) {
> > + assert(Section && "Cannot switch to a null section!");
> > +
> > + // If already in this section, then this is a noop.
> > + if (Section == CurSection) return;
> > +
> > + CurSection = Section;
> > + CurSectionData = &Assembler.getOrCreateSectionData(*Section);
> > +}
>
> Identical to MCObjectStreamer::SwitchSection.
Deleted.
> > +void MCELFStreamer::EmitAssignment(MCSymbol *Symbol, const MCExpr *Value) {
> > + // Only absolute symbols can be redefined.
> > + assert((Symbol->isUndefined() || Symbol->isAbsolute()) &&
> > + "Cannot define a symbol twice!");
> > +
> > + // FIXME: Lift context changes into super class.
> > + // FIXME: Set associated section.
> > + // Symbol->setValue(AddValueSymbols(Value));
> > +}
>
> Should this be different from MCMachOStreamer::EmitAssignment?
I don't think so. I've fixed this now.
> > +void MCELFStreamer::EmitSymbolDesc(MCSymbol *Symbol, unsigned DescValue) {
> > +}
>
> assert(0)? Or what is this supposed to be useful for?
I actually don't think the .desc directive makes sense for ELF. I've
deleted it.
> > +void MCELFStreamer::EmitCommonSymbol(MCSymbol *Symbol, uint64_t Size,
> > + unsigned ByteAlignment) {
> > + MCSymbolData &SD = Assembler.getOrCreateSymbolData(*Symbol);
> > +
> > + if ((SD.getFlags() & (0xff << ELF::STB_SHIFT)) == ELF::STB_LOCAL << ELF::STB_SHIFT) {
>
> STB_* is only 4 bits, so 0xf, no?
Correct. Fixed.
> > +void MCELFStreamer::EmitZerofill(const MCSection *Section, MCSymbol *Symbol,
> > + unsigned Size, unsigned ByteAlignment) {
>
> AFAIK, .zerofill is Darwin-only.
True. Deleted.
> > +void MCELFStreamer::EmitTBSSSymbol(const MCSection *Section, MCSymbol *Symbol,
> > + uint64_t Size, unsigned ByteAlignment) {
>
> Also Darwin only, AFAIK.
True. Deleted.
> > +void MCELFStreamer::EmitBytes(StringRef Data, unsigned AddrSpace) {
> > + getOrCreateDataFragment()->getContents().append(Data.begin(), Data.end());
> > +}
>
> Same as MCMachOStreamer::EmitBytes; toss in a TODO?
Done.
> > +void MCELFStreamer::EmitValue(const MCExpr *Value, unsigned Size,
> > + unsigned AddrSpace) {
> > + MCDataFragment *DF = getOrCreateDataFragment();
> > +
> > + // Avoid fixups when possible.
> > + int64_t AbsValue;
> > + if (AddValueSymbols(Value)->EvaluateAsAbsolute(AbsValue)) {
> > + // FIXME: Endianness assumption.
> > + for (unsigned i = 0; i != Size; ++i)
> > + DF->getContents().push_back(uint8_t(AbsValue >> (i * 8)));
> > + } else {
> > + DF->addFixup(MCFixup::Create(DF->getContents().size(), AddValueSymbols(Value),
> > + MCFixup::getKindForSize(Size)));
> > + DF->getContents().resize(DF->getContents().size() + Size, 0);
> > + }
> > +}
>
> Same as MCMachOStreamer::EmitValue; toss in a TODO?
Done.
> > +void MCELFStreamer::EmitValueToAlignment(unsigned ByteAlignment,
> > + int64_t Value, unsigned ValueSize,
> > + unsigned MaxBytesToEmit) {
> > + if (MaxBytesToEmit == 0)
> > + MaxBytesToEmit = ByteAlignment;
> > + new MCAlignFragment(ByteAlignment, Value, ValueSize, MaxBytesToEmit,
> > + CurSectionData);
> > +
> > + // Update the maximum alignment on the current section if necessary.
> > + if (ByteAlignment > CurSectionData->getAlignment())
> > + CurSectionData->setAlignment(ByteAlignment);
> > +}
>
> Same as MCMachOStreamer::EmitValueToAlignment; toss in a TODO?
Done.
> > +void MCELFStreamer::EmitCodeAlignment(unsigned ByteAlignment,
> > + unsigned MaxBytesToEmit) {
> > + if (MaxBytesToEmit == 0)
> > + MaxBytesToEmit = ByteAlignment;
> > + new MCAlignFragment(ByteAlignment, 0, 1, MaxBytesToEmit,
> > + CurSectionData);
> > +
> > + // Update the maximum alignment on the current section if necessary.
> > + if (ByteAlignment > CurSectionData->getAlignment())
> > + CurSectionData->setAlignment(ByteAlignment);
> > +}
>
> Same as MCMachOStreamer::EmitCodeAlignment; toss in a TODO?
Done.
> > +void MCELFStreamer::EmitValueToOffset(const MCExpr *Offset,
> > + unsigned char Value) {
> > + new MCOrgFragment(*Offset, Value, CurSectionData);
> > +}
>
> Same as MCMachOStreamer::EmitValueToOffset; toss in a TODO?
Done. I'll post the updated series sometime this week.
Thanks!
More information about the llvm-commits
mailing list