[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split

Daniel Dunbar daniel at zuster.org
Mon Mar 18 23:41:31 PDT 2013


Hi David,

Ok, here are my main high level comments on the patch. I have some specific
nits too but I'll save them for later:

#1. I would rather this patch not require modifying the MCAssembler level
interface. Instead, what I would like is for the new data structures
(IndirectSymbols and IndirectSymbolSections) to get built up inside
BindIndirectSymbols().

#2. I don't think this patch should actually cause test cases to change.
Most of those test cases were generated based on matching existing 'as'
output for x86, and the way I envisioned this patch working I wouldn't have
expected it to change the order of symbols except in the case where the
indirect symbols would be out of order.

I believe that if you fix #1 then #2 will fall out, because the indirect
symbol lists will get built up in the same order as they are now (based on
iterating the indirect symbol list twice).

 - Daniel


On Thu, Mar 14, 2013 at 10:26 AM, David Fang <fang at csl.cornell.edu> wrote:

> Sure, Daniel, you seem pretty active, I can ping again in a week.
>
> patch notes: The patch is written so that changing a single #if in
> MCAssembler.h at the top will effectively revert it (for ease of testing).
> A few typedefs for the new structures in MCAssembler are made public just
> for convenience.  It looks like I'm deep in FIXME territory, judging by the
> comments in the touched files.  Feel free to pass a revised patch back to
> me as you see fit.
>
> Fang
>
>  Hi David,
>>
>> Thanks for working on a patch, I'll take a look but it may take me a
>> couple
>> days to get to it.
>>
>> - Daniel
>>
>>
>> On Wed, Mar 13, 2013 at 1:48 PM, David Fang <fang at csl.cornell.edu> wrote:
>>
>>  Daniel,
>>>         The resulting LLVM test diffs with my patch are shown here:
>>> http://llvm.org/bugs/show_bug.****cgi?id=14636#c38<http://llvm.org/bugs/show_bug.**cgi?id=14636#c38>
>>> <http://**llvm.org/bugs/show_bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=14636#c38>
>>> >
>>>
>>>
>>> 2 tests "failed" in MC/MachO: indirect-symbols.s and symbol-indirect.s
>>> No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
>>>
>>> The differences are that symbols are re-ordered, which accomplishes the
>>> objective of the patch. There are FIXME comments that require a
>>> maintainer's scrutiny.
>>> Could you review these differences that result from my patch?
>>> The patch as-is now is in proof-of-concept form.  I'd be happy to revise
>>> as you suggest.
>>> If this patch benefits all Mach-O architectures (getting closer to usable
>>> integrated assembly), I'd like to get this into trunk.
>>>
>>> Fang
>>>
>>>
>>>  Michael, Daniel,
>>>
>>>>         The attached patch I wrote works for me so far on my small test
>>>> cases: llvm/clang's object file matches the indirect symbol ordering of
>>>> the
>>>> /usr/bin/as-assembled object's (tested on my powerpc-darwin8 branch).
>>>> Michael, could you try it out on your own test cases?
>>>> Daniel, could you kindly comment and review?  Just the technical merits,
>>>> not style.
>>>>          I've also attached the same patch to bug 14636.
>>>>
>>>> Thanks for taking a look.
>>>>
>>>> Fang
>>>>
>>>>   Daniel,
>>>>
>>>>>    Some quick questions about your proposed recipe:
>>>>>
>>>>>    Ok, here is what I think the right fix is. Instead of creating the
>>>>>>>   IndirectSymBase mapping we use to associate sections with their
>>>>>>>   indirect offset start, in BindIndirectSymbols() we should:
>>>>>>>
>>>>>>>>   1. Add a simple container struct (lets say > >
>>>>>>>>>
>>>>>>>> MachSectionIndirectSymbols)
>>>>>
>>>>>>   for tracking the per-section list of indirect symbols. It will keep
>>>>>>>   the list of symbols in the section and the index of the first > >
>>>>>>>
>>>>>>   indirect
>>>>>
>>>>>>   symbol in the section.
>>>>>>>
>>>>>>
>>>>>  Like a map<section*, vector<indirect_symbol> >?
>>>>>
>>>>>  Instead of keeping track of the first indirect symbol index, could we
>>>>> just
>>>>>  as easily compute the index offsets later during the second pass:
>>>>>
>>>>>  offset = 0;
>>>>>  for each section S (in order)
>>>>>    record offset for section S
>>>>>    [optional] write vector of indirect symbols for S
>>>>>    offset += S.num_indirect_symbols();
>>>>>  end for
>>>>>
>>>>>  I was able to fix the reserve1 field by doing something like this (bug
>>>>>  14636, comments 30-33), but I only later discovered that the indirect
>>>>>  symbols needed to be reordered.
>>>>>
>>>>>    2. Keep a mapping from sections to the above type.
>>>>>>>   3. Add a SetVector to record the order of the sections that have
>>>>>>>   indirect symbols.
>>>>>>>
>>>>>>
>>>>>  This I understand, I found SetVector and SmallSetVector in ADT.
>>>>>
>>>>>    4. During BindIndirectSymbols() we maintain the above information
>>>>>>>   (populating the MachSectionIndirectSymbols per-section symbol > >
>>>>>>>
>>>>>>   arrays).
>>>>>
>>>>>  BindIndirectSymbols looks like it need not change, because it's
>>>>> already
>>>>>  operating on the assumption of ordered indirect symbols, right?
>>>>>
>>>>>    5. Once we have scanned all the symbols we make another pass over
>>>>>>>
>>>>>> the
>>>>>
>>>>>>   sections (in the order seen via indirect symbols) and assign the >
>>>>>>>
>>>>>>   start
>>>>>>
>>>>>>>   indices.
>>>>>>>
>>>>>>
>>>>>  as I outlined above (offset = start indices).
>>>>>
>>>>>    6. Update writing of the indirect symbol table to write in the same
>>>>>>>   order as traversed in #5.
>>>>>>>
>>>>>>>>   Does that make sense? It's more work than your patch but it
>>>>>>>>>
>>>>>>>> (a) > >   should
>>>>>
>>>>>>   preserve binary compatibility with 'as' in situations where the
>>>>>>>   indirect symbols don't appear out of order w.r.t. the sections,
>>>>>>>
>>>>>> (b) > >   it
>>>>>
>>>>>>   makes somewhere more explicit the relationship between sections and
>>>>>>>   their list of symbols being in contiguous order.
>>>>>>>
>>>>>>>>   - Daniel
>>>>>>>>>
>>>>>>>>
>>>>>  Will this re-ordering be acceptable to all Mach-O back-ends?  x86,
>>>>> ARM,
>>>>>  PPC?
>>>>>
>>>>>  Fang
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>  --
>>> David Fang
>>> http://www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>>> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>>
>>>
>>>
>>>
>>
> --
> David Fang
> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130318/b9e4995f/attachment.html>


More information about the llvm-commits mailing list