[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split
David Fang
fang at csl.cornell.edu
Wed Mar 13 13:48:30 PDT 2013
Daniel,
The resulting LLVM test diffs with my patch are shown here:
http://llvm.org/bugs/show_bug.cgi?id=14636#c38
2 tests "failed" in MC/MachO: indirect-symbols.s and symbol-indirect.s
No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
The differences are that symbols are re-ordered, which accomplishes the
objective of the patch.
There are FIXME comments that require a maintainer's scrutiny.
Could you review these differences that result from my patch?
The patch as-is now is in proof-of-concept form. I'd be happy to revise
as you suggest.
If this patch benefits all Mach-O architectures (getting closer to usable
integrated assembly), I'd like to get this into trunk.
Fang
> Michael, Daniel,
> The attached patch I wrote works for me so far on my small test
> cases: llvm/clang's object file matches the indirect symbol ordering of the
> /usr/bin/as-assembled object's (tested on my powerpc-darwin8 branch).
> Michael, could you try it out on your own test cases?
> Daniel, could you kindly comment and review? Just the technical merits, not
> style.
> I've also attached the same patch to bug 14636.
>
> Thanks for taking a look.
>
> Fang
>
>> Daniel,
>> Some quick questions about your proposed recipe:
>>
>> > > Ok, here is what I think the right fix is. Instead of creating the
>> > > IndirectSymBase mapping we use to associate sections with their
>> > > indirect offset start, in BindIndirectSymbols() we should:
>> > >
>> > > 1. Add a simple container struct (lets say
>> > > MachSectionIndirectSymbols)
>> > > for tracking the per-section list of indirect symbols. It will keep
>> > > the list of symbols in the section and the index of the first
>> > > indirect
>> > > symbol in the section.
>>
>> Like a map<section*, vector<indirect_symbol> >?
>>
>> Instead of keeping track of the first indirect symbol index, could we just
>> as easily compute the index offsets later during the second pass:
>>
>> offset = 0;
>> for each section S (in order)
>> record offset for section S
>> [optional] write vector of indirect symbols for S
>> offset += S.num_indirect_symbols();
>> end for
>>
>> I was able to fix the reserve1 field by doing something like this (bug
>> 14636, comments 30-33), but I only later discovered that the indirect
>> symbols needed to be reordered.
>>
>> > > 2. Keep a mapping from sections to the above type.
>> > > 3. Add a SetVector to record the order of the sections that have
>> > > indirect symbols.
>>
>> This I understand, I found SetVector and SmallSetVector in ADT.
>>
>> > > 4. During BindIndirectSymbols() we maintain the above information
>> > > (populating the MachSectionIndirectSymbols per-section symbol
>> > > arrays).
>>
>> BindIndirectSymbols looks like it need not change, because it's already
>> operating on the assumption of ordered indirect symbols, right?
>>
>> > > 5. Once we have scanned all the symbols we make another pass over the
>> > > sections (in the order seen via indirect symbols) and assign the
>> > > start
>> > > indices.
>>
>> as I outlined above (offset = start indices).
>>
>> > > 6. Update writing of the indirect symbol table to write in the same
>> > > order as traversed in #5.
>> > >
>> > > Does that make sense? It's more work than your patch but it (a)
>> > > should
>> > > preserve binary compatibility with 'as' in situations where the
>> > > indirect symbols don't appear out of order w.r.t. the sections, (b)
>> > > it
>> > > makes somewhere more explicit the relationship between sections and
>> > > their list of symbols being in contiguous order.
>> > >
>> > > - Daniel
>>
>> Will this re-ordering be acceptable to all Mach-O back-ends? x86, ARM,
>> PPC?
>>
>> Fang
>>
>>
>
>
--
David Fang
http://www.csl.cornell.edu/~fang/
More information about the llvm-commits
mailing list