[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split

David Fang fang at csl.cornell.edu
Wed Mar 13 13:48:30 PDT 2013


Daniel,
 	The resulting LLVM test diffs with my patch are shown here:
http://llvm.org/bugs/show_bug.cgi?id=14636#c38

2 tests "failed" in MC/MachO: indirect-symbols.s and symbol-indirect.s
No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).

The differences are that symbols are re-ordered, which accomplishes the 
objective of the patch. 
There are FIXME comments that require a maintainer's scrutiny.
Could you review these differences that result from my patch?
The patch as-is now is in proof-of-concept form.  I'd be happy to revise 
as you suggest.
If this patch benefits all Mach-O architectures (getting closer to usable 
integrated assembly), I'd like to get this into trunk.

Fang

> Michael, Daniel,
> 	The attached patch I wrote works for me so far on my small test 
> cases: llvm/clang's object file matches the indirect symbol ordering of the 
> /usr/bin/as-assembled object's (tested on my powerpc-darwin8 branch).
> Michael, could you try it out on your own test cases?
> Daniel, could you kindly comment and review?  Just the technical merits, not 
> style.
> 	 I've also attached the same patch to bug 14636.
>
> Thanks for taking a look.
>
> Fang
>
>>  Daniel,
>>    Some quick questions about your proposed recipe:
>> 
>> > >   Ok, here is what I think the right fix is. Instead of creating the
>> > >   IndirectSymBase mapping we use to associate sections with their
>> > >   indirect offset start, in BindIndirectSymbols() we should:
>> > > 
>> > >   1. Add a simple container struct (lets say 
>> > >   MachSectionIndirectSymbols)
>> > >   for tracking the per-section list of indirect symbols. It will keep
>> > >   the list of symbols in the section and the index of the first 
>> > >   indirect
>> > >   symbol in the section.
>>
>>  Like a map<section*, vector<indirect_symbol> >?
>>
>>  Instead of keeping track of the first indirect symbol index, could we just
>>  as easily compute the index offsets later during the second pass:
>>
>>  offset = 0;
>>  for each section S (in order)
>>    record offset for section S
>>    [optional] write vector of indirect symbols for S
>>    offset += S.num_indirect_symbols();
>>  end for
>>
>>  I was able to fix the reserve1 field by doing something like this (bug
>>  14636, comments 30-33), but I only later discovered that the indirect
>>  symbols needed to be reordered.
>> 
>> > >   2. Keep a mapping from sections to the above type.
>> > >   3. Add a SetVector to record the order of the sections that have
>> > >   indirect symbols.
>>
>>  This I understand, I found SetVector and SmallSetVector in ADT.
>> 
>> > >   4. During BindIndirectSymbols() we maintain the above information
>> > >   (populating the MachSectionIndirectSymbols per-section symbol 
>> > >   arrays).
>>
>>  BindIndirectSymbols looks like it need not change, because it's already
>>  operating on the assumption of ordered indirect symbols, right?
>> 
>> > >   5. Once we have scanned all the symbols we make another pass over the
>> > >   sections (in the order seen via indirect symbols) and assign the 
>> > >   start
>> > >   indices.
>>
>>  as I outlined above (offset = start indices).
>> 
>> > >   6. Update writing of the indirect symbol table to write in the same
>> > >   order as traversed in #5.
>> > > 
>> > >   Does that make sense? It's more work than your patch but it (a) 
>> > >   should
>> > >   preserve binary compatibility with 'as' in situations where the
>> > >   indirect symbols don't appear out of order w.r.t. the sections, (b) 
>> > >   it
>> > >   makes somewhere more explicit the relationship between sections and
>> > >   their list of symbols being in contiguous order.
>> > > 
>> > >   - Daniel
>>
>>  Will this re-ordering be acceptable to all Mach-O back-ends?  x86, ARM,
>>  PPC?
>>
>>  Fang
>> 
>> 
>
>

-- 
David Fang
http://www.csl.cornell.edu/~fang/




More information about the llvm-commits mailing list