[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split

David Fang fang at csl.cornell.edu
Thu Mar 14 10:26:18 PDT 2013


Sure, Daniel, you seem pretty active, I can ping again in a week.

patch notes: The patch is written so that changing a single #if in 
MCAssembler.h at the top will effectively revert it (for ease of testing). 
A few typedefs for the new structures in MCAssembler are made public just 
for convenience.  It looks like I'm deep in FIXME territory, judging by 
the comments in the touched files.  Feel free to pass a revised patch back 
to me as you see fit.

Fang

> Hi David,
>
> Thanks for working on a patch, I'll take a look but it may take me a couple
> days to get to it.
>
> - Daniel
>
>
> On Wed, Mar 13, 2013 at 1:48 PM, David Fang <fang at csl.cornell.edu> wrote:
>
>> Daniel,
>>         The resulting LLVM test diffs with my patch are shown here:
>> http://llvm.org/bugs/show_bug.**cgi?id=14636#c38<http://llvm.org/bugs/show_bug.cgi?id=14636#c38>
>>
>> 2 tests "failed" in MC/MachO: indirect-symbols.s and symbol-indirect.s
>> No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
>>
>> The differences are that symbols are re-ordered, which accomplishes the
>> objective of the patch. There are FIXME comments that require a
>> maintainer's scrutiny.
>> Could you review these differences that result from my patch?
>> The patch as-is now is in proof-of-concept form.  I'd be happy to revise
>> as you suggest.
>> If this patch benefits all Mach-O architectures (getting closer to usable
>> integrated assembly), I'd like to get this into trunk.
>>
>> Fang
>>
>>
>>  Michael, Daniel,
>>>         The attached patch I wrote works for me so far on my small test
>>> cases: llvm/clang's object file matches the indirect symbol ordering of the
>>> /usr/bin/as-assembled object's (tested on my powerpc-darwin8 branch).
>>> Michael, could you try it out on your own test cases?
>>> Daniel, could you kindly comment and review?  Just the technical merits,
>>> not style.
>>>          I've also attached the same patch to bug 14636.
>>>
>>> Thanks for taking a look.
>>>
>>> Fang
>>>
>>>   Daniel,
>>>>    Some quick questions about your proposed recipe:
>>>>
>>>>>>   Ok, here is what I think the right fix is. Instead of creating the
>>>>>>   IndirectSymBase mapping we use to associate sections with their
>>>>>>   indirect offset start, in BindIndirectSymbols() we should:
>>>>>>>>   1. Add a simple container struct (lets say > >
>>>> MachSectionIndirectSymbols)
>>>>>>   for tracking the per-section list of indirect symbols. It will keep
>>>>>>   the list of symbols in the section and the index of the first > >
>>>>   indirect
>>>>>>   symbol in the section.
>>>>
>>>>  Like a map<section*, vector<indirect_symbol> >?
>>>>
>>>>  Instead of keeping track of the first indirect symbol index, could we
>>>> just
>>>>  as easily compute the index offsets later during the second pass:
>>>>
>>>>  offset = 0;
>>>>  for each section S (in order)
>>>>    record offset for section S
>>>>    [optional] write vector of indirect symbols for S
>>>>    offset += S.num_indirect_symbols();
>>>>  end for
>>>>
>>>>  I was able to fix the reserve1 field by doing something like this (bug
>>>>  14636, comments 30-33), but I only later discovered that the indirect
>>>>  symbols needed to be reordered.
>>>>
>>>>>>   2. Keep a mapping from sections to the above type.
>>>>>>   3. Add a SetVector to record the order of the sections that have
>>>>>>   indirect symbols.
>>>>
>>>>  This I understand, I found SetVector and SmallSetVector in ADT.
>>>>
>>>>>>   4. During BindIndirectSymbols() we maintain the above information
>>>>>>   (populating the MachSectionIndirectSymbols per-section symbol > >
>>>>   arrays).
>>>>
>>>>  BindIndirectSymbols looks like it need not change, because it's already
>>>>  operating on the assumption of ordered indirect symbols, right?
>>>>
>>>>>>   5. Once we have scanned all the symbols we make another pass over
>>>> the
>>>>>>   sections (in the order seen via indirect symbols) and assign the >
>>>>>   start
>>>>>>   indices.
>>>>
>>>>  as I outlined above (offset = start indices).
>>>>
>>>>>>   6. Update writing of the indirect symbol table to write in the same
>>>>>>   order as traversed in #5.
>>>>>>>>   Does that make sense? It's more work than your patch but it
>>>> (a) > >   should
>>>>>>   preserve binary compatibility with 'as' in situations where the
>>>>>>   indirect symbols don't appear out of order w.r.t. the sections,
>>>> (b) > >   it
>>>>>>   makes somewhere more explicit the relationship between sections and
>>>>>>   their list of symbols being in contiguous order.
>>>>>>>>   - Daniel
>>>>
>>>>  Will this re-ordering be acceptable to all Mach-O back-ends?  x86, ARM,
>>>>  PPC?
>>>>
>>>>  Fang
>>>>
>>>>
>>>>
>>>
>>>
>> --
>> David Fang
>> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>
>>
>>
>

-- 
David Fang
http://www.csl.cornell.edu/~fang/




More information about the llvm-commits mailing list