[PATCH] [MachO] MachOWriter generates bad indirect symbol tables when sections are split

Daniel Dunbar daniel at zuster.org
Tue Mar 19 10:18:10 PDT 2013


On Tue, Mar 19, 2013 at 2:27 AM, David Fang <fang at csl.cornell.edu> wrote:

> Hi Daniel,
>         Thanks for reviewing the patch.  I thought I tried suggestion #1
> before this patch in http://llvm.org/bugs/show_bug.**cgi?id=14636#c33<http://llvm.org/bugs/show_bug.cgi?id=14636#c33>,
> which only reordeed IndirectSymBase without reordering WriteObject, and
> didn't quite produce the desired result.  I may revisit it again.
>         If I reorder IndirectSymBase, I need to also reorder in
> WriteObject, so am I allowed to move the IndirectSymbol (mapped by section)
> and IndirectSymbolSection (order of first appearance) structures out of
> MCAssembler into MachObjectWriter?  I need the ordering information from
> the first pass of BindIndirectSymbols to persist until WriteObject, without
> having to sort twice.  This transformation is intended for X86,PPC
> (probably ARM as well) +mach-o back-ends after all.
>

I don't understand what the issue is here, you can of course add extra data
structures into the MachObjectWriter class and compute them during
BindIndirectSymbols() and reuse them during WriteObject(). You shouldn't
need to change anything in the MCAssembler.

Basically I am just suggesting moving the building up of the Mach-O
specific data structures into BindIndirectSymbols() and the
MachObjectWriter class.

 - Daniel


>
> Fang
>
>  Hi David,
>>
>> Ok, here are my main high level comments on the patch. I have some
>> specific
>> nits too but I'll save them for later:
>>
>> #1. I would rather this patch not require modifying the MCAssembler level
>> interface. Instead, what I would like is for the new data structures
>> (IndirectSymbols and IndirectSymbolSections) to get built up inside
>> BindIndirectSymbols().
>>
>> #2. I don't think this patch should actually cause test cases to change.
>> Most of those test cases were generated based on matching existing 'as'
>> output for x86, and the way I envisioned this patch working I wouldn't
>> have
>> expected it to change the order of symbols except in the case where the
>> indirect symbols would be out of order.
>>
>> I believe that if you fix #1 then #2 will fall out, because the indirect
>> symbol lists will get built up in the same order as they are now (based on
>> iterating the indirect symbol list twice).
>>
>> - Daniel
>>
>>
>> On Thu, Mar 14, 2013 at 10:26 AM, David Fang <fang at csl.cornell.edu>
>> wrote:
>>
>>  Sure, Daniel, you seem pretty active, I can ping again in a week.
>>>
>>> patch notes: The patch is written so that changing a single #if in
>>> MCAssembler.h at the top will effectively revert it (for ease of
>>> testing).
>>> A few typedefs for the new structures in MCAssembler are made public just
>>> for convenience.  It looks like I'm deep in FIXME territory, judging by
>>> the
>>> comments in the touched files.  Feel free to pass a revised patch back to
>>> me as you see fit.
>>>
>>> Fang
>>>
>>>  Hi David,
>>>
>>>>
>>>> Thanks for working on a patch, I'll take a look but it may take me a
>>>> couple
>>>> days to get to it.
>>>>
>>>> - Daniel
>>>>
>>>>
>>>> On Wed, Mar 13, 2013 at 1:48 PM, David Fang <fang at csl.cornell.edu>
>>>> wrote:
>>>>
>>>>  Daniel,
>>>>
>>>>>         The resulting LLVM test diffs with my patch are shown here:
>>>>> http://llvm.org/bugs/show_bug.******cgi?id=14636#c38<http://llvm.org/bugs/show_bug.****cgi?id=14636#c38>
>>>>> <http://**llvm.org/bugs/show_bug.**cgi?**id=14636#c38<http://llvm.org/bugs/show_bug.**cgi?id=14636#c38>
>>>>> >
>>>>> <http://**llvm.org/bugs/show_**bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=**14636#c38>
>>>>> <http://**llvm.org/bugs/show_bug.cgi?id=**14636#c38<http://llvm.org/bugs/show_bug.cgi?id=14636#c38>
>>>>> >
>>>>>
>>>>>
>>>>>>
>>>>>
>>>>> 2 tests "failed" in MC/MachO: indirect-symbols.s and symbol-indirect.s
>>>>> No other LLVM/clang tests regressed (Targets: PPC,X86,ARM).
>>>>>
>>>>> The differences are that symbols are re-ordered, which accomplishes the
>>>>> objective of the patch. There are FIXME comments that require a
>>>>> maintainer's scrutiny.
>>>>> Could you review these differences that result from my patch?
>>>>> The patch as-is now is in proof-of-concept form.  I'd be happy to
>>>>> revise
>>>>> as you suggest.
>>>>> If this patch benefits all Mach-O architectures (getting closer to
>>>>> usable
>>>>> integrated assembly), I'd like to get this into trunk.
>>>>>
>>>>> Fang
>>>>>
>>>>>
>>>>>  Michael, Daniel,
>>>>>
>>>>>          The attached patch I wrote works for me so far on my small
>>>>>> test
>>>>>> cases: llvm/clang's object file matches the indirect symbol ordering
>>>>>> of
>>>>>> the
>>>>>> /usr/bin/as-assembled object's (tested on my powerpc-darwin8 branch).
>>>>>> Michael, could you try it out on your own test cases?
>>>>>> Daniel, could you kindly comment and review?  Just the technical
>>>>>> merits,
>>>>>> not style.
>>>>>>          I've also attached the same patch to bug 14636.
>>>>>>
>>>>>> Thanks for taking a look.
>>>>>>
>>>>>> Fang
>>>>>>
>>>>>>   Daniel,
>>>>>>
>>>>>>     Some quick questions about your proposed recipe:
>>>>>>>
>>>>>>>    Ok, here is what I think the right fix is. Instead of creating the
>>>>>>>
>>>>>>>>   IndirectSymBase mapping we use to associate sections with their
>>>>>>>>>   indirect offset start, in BindIndirectSymbols() we should:
>>>>>>>>>
>>>>>>>>>    1. Add a simple container struct (lets say > >
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  MachSectionIndirectSymbols)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>    for tracking the per-section list of indirect symbols. It will
>>>>>>>> keep
>>>>>>>>
>>>>>>>>>   the list of symbols in the section and the index of the first > >
>>>>>>>>>
>>>>>>>>>    indirect
>>>>>>>>
>>>>>>>
>>>>>>>    symbol in the section.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>   Like a map<section*, vector<indirect_symbol> >?
>>>>>>>
>>>>>>>  Instead of keeping track of the first indirect symbol index, could
>>>>>>> we
>>>>>>> just
>>>>>>>  as easily compute the index offsets later during the second pass:
>>>>>>>
>>>>>>>  offset = 0;
>>>>>>>  for each section S (in order)
>>>>>>>    record offset for section S
>>>>>>>    [optional] write vector of indirect symbols for S
>>>>>>>    offset += S.num_indirect_symbols();
>>>>>>>  end for
>>>>>>>
>>>>>>>  I was able to fix the reserve1 field by doing something like this
>>>>>>> (bug
>>>>>>>  14636, comments 30-33), but I only later discovered that the
>>>>>>> indirect
>>>>>>>  symbols needed to be reordered.
>>>>>>>
>>>>>>>    2. Keep a mapping from sections to the above type.
>>>>>>>
>>>>>>>>   3. Add a SetVector to record the order of the sections that have
>>>>>>>>>   indirect symbols.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>   This I understand, I found SetVector and SmallSetVector in ADT.
>>>>>>>
>>>>>>>    4. During BindIndirectSymbols() we maintain the above information
>>>>>>>
>>>>>>>>   (populating the MachSectionIndirectSymbols per-section symbol > >
>>>>>>>>>
>>>>>>>>>    arrays).
>>>>>>>>
>>>>>>>
>>>>>>>  BindIndirectSymbols looks like it need not change, because it's
>>>>>>> already
>>>>>>>  operating on the assumption of ordered indirect symbols, right?
>>>>>>>
>>>>>>>    5. Once we have scanned all the symbols we make another pass over
>>>>>>>
>>>>>>>>
>>>>>>>>>  the
>>>>>>>>
>>>>>>>
>>>>>>>    sections (in the order seen via indirect symbols) and assign the >
>>>>>>>>
>>>>>>>>>
>>>>>>>>>    start
>>>>>>>>
>>>>>>>>    indices.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>   as I outlined above (offset = start indices).
>>>>>>>
>>>>>>>    6. Update writing of the indirect symbol table to write in the
>>>>>>> same
>>>>>>>
>>>>>>>>   order as traversed in #5.
>>>>>>>>>
>>>>>>>>>    Does that make sense? It's more work than your patch but it
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  (a) > >   should
>>>>>>>>>>
>>>>>>>>>
>>>>>>>    preserve binary compatibility with 'as' in situations where the
>>>>>>>>
>>>>>>>>>   indirect symbols don't appear out of order w.r.t. the sections,
>>>>>>>>>
>>>>>>>>>  (b) > >   it
>>>>>>>>
>>>>>>>
>>>>>>>    makes somewhere more explicit the relationship between sections
>>>>>>>> and
>>>>>>>>
>>>>>>>>>   their list of symbols being in contiguous order.
>>>>>>>>>
>>>>>>>>>    - Daniel
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>   Will this re-ordering be acceptable to all Mach-O back-ends?
>>>>>>>  x86,
>>>>>>> ARM,
>>>>>>>  PPC?
>>>>>>>
>>>>>>>  Fang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>  --
>>>>>>
>>>>> David Fang
>>>>> http://www.csl.cornell.edu/~******fang/<http://www.csl.cornell.edu/~****fang/>
>>>>> <http://www.csl.**cornell.edu/~**fang/<http://www.csl.cornell.edu/~**fang/>
>>>>> ><
>>>>> http://www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>>>>> http://www.csl.cornell.edu/~**fang/<http://www.csl.cornell.edu/~fang/>
>>>>> >>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>  --
>>> David Fang
>>> http://www.csl.cornell.edu/~****fang/<http://www.csl.cornell.edu/~**fang/><
>>> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>>
>>>
>>>
>>>
>>
> --
> David Fang
> http://www.csl.cornell.edu/~**fang/ <http://www.csl.cornell.edu/~fang/>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130319/28954675/attachment.html>


More information about the llvm-commits mailing list